I liked the idea and tried doing a design of my own. What I didn't like was that each operation is now split into multiple instructions that can't execute concurrently, and I couldn't see how to speed that up given the limits on instruction bus bandwidth.
What I figured was to make the functional units more complex, so instead of having just two inputs (left and right operand, with an implicit function), they'd also take an op code. This meant I could reduce the number of addresses enough that a single move instruction could be packed into one byte. I don't recall for sure, but I think I used two bits to indicate which bus the move operated on, so you could get three moves happening at once, one per bus (I think the last 2-bit pattern was reserved for special operations, but I don't recall what they were).
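A rough sketch of what such a one-byte move could look like. Only the 2-bit bus field and the reserved fourth pattern come from the description above; the split of the remaining six bits into a 3-bit source and a 3-bit destination, and the names `encode_move` and `decode_move`, are assumptions made purely for illustration.

```python
# Assumed layout (not from the original design): [bus:2][src:3][dst:3]
SPECIAL_BUS = 0b11  # fourth 2-bit pattern, reserved for special operations


def encode_move(bus, src, dst):
    """Pack a move on bus 0..2 from source address src to destination dst."""
    if not 0 <= bus < SPECIAL_BUS:
        raise ValueError("bus must be 0, 1, or 2")
    if not (0 <= src < 8 and 0 <= dst < 8):
        raise ValueError("src and dst must fit in 3 bits in this sketch")
    return (bus << 6) | (src << 3) | dst


def decode_move(byte):
    """Unpack one instruction byte into a move or a special operation."""
    bus = (byte >> 6) & 0b11
    if bus == SPECIAL_BUS:
        # The meaning of the low six bits is unknown; just pass them through.
        return ("special", byte & 0x3F)
    return ("move", bus, (byte >> 3) & 0b111, byte & 0b111)
```

With this layout, three consecutive bytes that name different buses could in principle be issued together, which is where the "three moves at once" would come from.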
Branches were straightforward, in that the instruction read unit was just another functional unit with left, right, and op inputs. You could simply transfer the output of a logic/comparator unit to the op input of the instruction read unit, and it would either jump to the new (relative, I think) address or not.
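A minimal sketch of that branching idea, under assumptions the description leaves open: here the left input is taken to hold a signed relative offset, and a nonzero value written to the op input means "take the branch". `FetchUnit` and `less_than` are hypothetical names.

```python
class FetchUnit:
    """Instruction read unit modelled as a functional unit with left, right, op inputs."""

    def __init__(self, pc=0):
        self.pc = pc
        self.left = 0   # assumed: signed relative branch offset
        self.right = 0  # unused in this sketch

    def write_op(self, condition):
        # A move into the op input triggers the unit: nonzero jumps, zero falls through.
        if condition:
            self.pc += self.left
        return self.pc


def less_than(a, b):
    """Stand-in for a comparator unit whose output is 1 or 0."""
    return 1 if a < b else 0


fetch = FetchUnit(pc=10)
fetch.left = -4                          # move the offset into the left input
fetch.write_op(less_than(2, 7))          # comparator output moved to op: branch taken
```

The point is that no dedicated branch instruction is needed: a conditional jump is just one more move, from the comparator's output to the fetch unit's op input.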
Constants were defined by a special instruction unit operation which would accumulate the next 1, 2, or 4 bytes of the instruction stream into the output register, ready to be moved elsewhere. This was in addition to regular load/store from memory.
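The constant mechanism could be sketched roughly as follows. The byte order is not something I can recall from the design, so little-endian is an assumption here, and `load_constant` is a hypothetical helper name.

```python
def load_constant(stream, pc, width):
    """Accumulate `width` bytes starting at `pc` into a value.

    Returns (value, new_pc) so that fetching resumes after the constant.
    Little-endian byte order is an assumption, not part of the original design.
    """
    if width not in (1, 2, 4):
        raise ValueError("width must be 1, 2, or 4 bytes")
    chunk = stream[pc:pc + width]
    if len(chunk) < width:
        raise ValueError("instruction stream ends inside the constant")
    return int.from_bytes(chunk, "little"), pc + width


# Example: a 2-byte constant embedded after one instruction byte.
program = bytes([0xAA, 0x34, 0x12, 0xFF])
value, next_pc = load_constant(program, 1, 2)
```

Here `value` ends up in what would be the unit's output register, and `next_pc` points at the byte where normal instruction decoding picks up again.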
There was also a dedicated register file, where the op code selected the register to read or write, just in case the functional unit input/output registers weren't adequate.
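One way the register file might behave as a functional unit, sketched under heavy assumptions: a move into the op input selects a register, a move into a data input stores into it, and the output always reflects the selected register. The register count of 8 and all names below are illustrative only.

```python
class RegisterFile:
    """Register file treated as a functional unit; the op code is the register index."""

    def __init__(self, count=8):  # register count is an assumption
        self.regs = [0] * count
        self.selected = 0

    def write_op(self, index):
        # The op input picks which register later moves refer to.
        if not 0 <= index < len(self.regs):
            raise IndexError("no such register")
        self.selected = index

    def write_data(self, value):
        # A move into the data input stores into the selected register.
        self.regs[self.selected] = value

    @property
    def output(self):
        # A move from the output reads the selected register.
        return self.regs[self.selected]


rf = RegisterFile()
rf.write_op(3)
rf.write_data(42)
```

This keeps the register file consistent with the other units: it is addressed with the same moves, and the op input does the job that a register number field would do in a conventional instruction.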
I liked this idea because there'd be no speed penalty - in fact, a typical "regular" instruction would only be 3 bytes, so with the same input bottleneck it could even be faster.
It didn't get beyond a high-level block diagram and instruction/unit descriptions. I'm sure I have a copy of it somewhere, but it got lost in a move (ironically).