They're not the only ones. The IBM mainframes have long been VMs implemented on top of various microcode platforms.
But the microcode implemented part or all of an interpreter for the machine code; the instructions weren't translated into directly-executed microcode. (And the System/360 Model 75 did it all in hardware, with no microcode.)
And the "instruction set" for the microcode was often rather close to the hardware, with extremely little in the way of "instruction decoding" of microinstructions, although I think some lower-end machines might have had microinstructions that didn't look too different from a regular instruction set. (Some might have been IBM 801s.)
So that's not exactly the same thing as what the Pentium Pro and successors, the Nx586, and the AMD K5 and successors, do.
Current mainframe processors, however, as far as I know 1) execute most instructions directly in hardware, 2) do so by translating them into micro-ops the same way current x86 processors do, and 3) trap some instructions to "millicode", which is z/Architecture machine code with some processor-dependent special instructions and access to processor-dependent special registers (and, yes, I can hear the word PALcode being shouted in the background...). See, for example, "A high-frequency custom CMOS S/390 microprocessor" (paywalled, but the abstract is free at that link, and mentions millicode) and "IBM zEnterprise 196 microprocessor and cache subsystem" (non-paywalled copy; mentions microoperations). I'm not sure those processors have any of what would normally be thought of as "microcode".
The midrange System/38 and older ("CISC") AS/400 machines also had an S/360-ish instruction set implemented in microcode. The compilers, however, generated code for an extremely CISCy virtual instruction set - but that code wasn't interpreted; it was translated into the native instruction set by low-level OS code and executed.
For legal reasons, the people who wrote the low-level OS code (compiled into the native instruction set) worked for a hardware manager and wrote what was called "vertical microcode" (the microcode that implemented the native instruction set was called "horizontal microcode"). That way, IBM wouldn't have to provide that code to competitors, the way they had to make the IBM mainframe OSes available to plug-compatible manufacturers, as it wasn't software - it was internal microcode. See "Inside the AS/400" by Frank Soltis, one of the architects of S/38 and AS/400.
Current ("RISC") AS/400s^WeServer iSeries^W^WSystem i^WIBM Power Systems running IBM i are similar, but the internal machine language is PowerPC^WPower ISA (with some extensions such as tag bits and decimal-arithmetic assists, present, I think, in recent POWER microprocessors but not documented) rather than the old "IMPI" 360-ish instruction set.
The main differences between RISC and CISC, as I recall, were lots of registers and the simplicity of the instruction set. Both the Intel and zSeries CISC instruction sets have lots of registers, though.
Depends on which version of the instruction set and your definition of "lots".
32-bit x86 had 8 registers (many x86 processors used register renaming, but they still had only 8 programmer-visible registers, and not all were as general as one might like), and they only went to 16 registers in x86-64. System/360 had 16 general-purpose registers (much more regular than x86, but that's not setting the bar all that high :-)), and that continues to z/Architecture, although the latest z/Architecture lets you do arithmetic separately on the upper 32 bits and lower 32 bits of a GPR, so for 32-bit and shorter arithmetic, you sort-of have 32 GPRs. z/Architecture also boosts the number of floating-point registers from 4 to 16.
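That split arithmetic on a 64-bit GPR can be pictured with a toy model (purely illustrative - the function names are invented, not z/Architecture mnemonics):

```python
# Toy model of high-word-style arithmetic: one 64-bit GPR used as two
# independent 32-bit accumulators, with no carry between the halves.
# Function names are invented for illustration, not real mnemonics.

MASK32 = 0xFFFFFFFF

def add_low(gpr, n):
    """Add n to the low 32 bits; the high 32 bits are untouched."""
    high, low = gpr >> 32, gpr & MASK32
    return (high << 32) | ((low + n) & MASK32)

def add_high(gpr, n):
    """Add n to the high 32 bits; the low 32 bits are untouched."""
    high, low = gpr >> 32, gpr & MASK32
    return (((high + n) & MASK32) << 32) | low

r1 = 0x00000001_00000002           # high half = 1, low half = 2
r1 = add_high(r1, 10)              # high half: 1 + 10 = 11 (0xB)
r1 = add_low(r1, 20)               # low half: 2 + 20 = 22 (0x16)
assert r1 == 0x0000000B_00000016   # two independent 32-bit results in one GPR
```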
Most RISC instruction sets had 31 or 32 GPRs (in the 31-GPR machines, one of them was always zero when fetched, and operations writing into it discarded the result); ARM had only 16 (one of which was the PC, so not really usable), but 64-bit ARM has 32.
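That always-zero register behaves like this toy register file (a hypothetical sketch, not any specific ISA's numbering or semantics):

```python
# Toy register file with a hardwired zero register: reads of register 0
# always return 0, and writes to it are silently discarded.
# Register numbering is hypothetical, for illustration only.

class RegFile:
    def __init__(self, n=32):
        self.regs = [0] * n

    def read(self, i):
        return 0 if i == 0 else self.regs[i]

    def write(self, i, value):
        if i != 0:              # writes to the zero register just vanish
            self.regs[i] = value

rf = RegFile()
rf.write(0, 123)                # discarded by the hardware
rf.write(5, 42)
assert rf.read(0) == 0          # always zero when fetched
assert rf.read(5) == 42
```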
So the main difference between RISC and CISC would be that you could - in theory - optimize "between" the CISC instructions if you coded RISC instead.
As I see it, differences between current RISC and CISC instruction sets are:
- RISC ISAs are load/store, meaning that arithmetic instructions are all register-to-register, whereas CISC ISAs have memory-to-register arithmetic. Splitting memory-to-register arithmetic does let you optimize "between" the memory reference and the arithmetic. However, breaking a memory-to-register arithmetic instruction into separate memory-reference and arithmetic micro-ops lets the hardware do similar things.
- RISC ISAs generally have more registers; that's not an inherent RISC vs. CISC characteristic, however. A CISC ISA could have, for example, 32 GPRs; at the time S/360 was designed, you couldn't just throw transistors at the problem, and, on the lower-end S/360s, the GPRs were stored in a small higher-speed core memory array, not in transistorized registers, and they may have thought that assembler-language programmers and compiler writers wouldn't have been able to make much use of 32 GPRs in any case.
- CISC ISAs may have individual "complex" instructions, such as procedure call instructions, string manipulation instructions, decimal arithmetic instructions, and various instructions and instruction set features to "close the semantic gap" between high-level languages and machine code, add extra forms of data protection, etc. - although the original procedure-call instructions in S/360 were pretty simple, BAL/BALR just putting the PC of the next instruction into a register and jumping to the target instruction, just as most RISC procedure-call instructions do. A lot of the really CISCy instruction sets may have been reactions to systems like S/360, viewing its instruction set as being far from CISCy enough, but that trend has largely died out.
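The BAL/BALR-style linkage in the last bullet can be sketched as toy machine state (simplified - this omits, for example, the flags that 24-bit-mode BAL stuffed into the register's high byte):

```python
# Toy sketch of BALR-style branch-and-link: the link register receives the
# address of the *next* instruction and execution jumps to the target;
# return is just a branch on the link register. No stack is involved.

def balr(state, link_reg, target_reg):
    state["regs"][link_reg] = state["pc"] + 2   # BALR is a 2-byte RR instruction
    state["pc"] = state["regs"][target_reg]

def branch_reg(state, reg):                     # return: branch on register
    state["pc"] = state["regs"][reg]

state = {"pc": 0x1000, "regs": {14: 0, 15: 0x2000}}
balr(state, 14, 15)                  # call the routine whose address is in R15
assert state["pc"] == 0x2000         # now executing the callee
assert state["regs"][14] == 0x1002   # R14 holds the return address
branch_reg(state, 14)                # callee returns via R14
assert state["pc"] == 0x1002
```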
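Going back to the load/store bullet, the hardware-side "cracking" of a memory-to-register add into separate micro-ops can be pictured like this (instruction and micro-op names are invented, not any real processor's internal format):

```python
# Toy illustration of cracking a CISC-style memory-to-register add into
# load/store-style micro-ops, the way the hardware does internally.
# The tuple encoding and names are invented for this sketch.

def crack(insn):
    op, dst, src = insn
    if op == "add" and src.startswith("["):      # memory operand?
        return [("load", "tmp", src),            # micro-op 1: fetch the operand
                ("add", dst, "tmp")]             # micro-op 2: reg-to-reg add
    return [insn]                                # already RISC-like: pass through

uops = crack(("add", "r1", "[r2+8]"))
assert uops == [("load", "tmp", "[r2+8]"), ("add", "r1", "tmp")]
```

Once the instruction is split like that, an out-of-order core can schedule other work between the load and the add, much as a RISC compiler could with two separate instructions.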
Presumably somebody tried this, but didn't get benefits worth shouting about.
Actually, I think many compilers for RISC processors will schedule instructions in that fashion. Of course, modern RISC processors may reschedule instructions in hardware, just as modern CISC processors may reschedule micro-ops in hardware ("out-of-order" processors).
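A minimal sketch of that kind of scheduling (a toy, not any real compiler's algorithm): hoist an instruction with no conflicting dependences into the gap between a load and the instruction that consumes its result.

```python
# Toy instruction scheduler: instructions are (op, dest, src...) tuples.
# When a load is immediately followed by its consumer, look for a later
# independent instruction and move it into the gap to hide the load latency.
# The dependence check is deliberately crude - this is only an illustration.

def reads(insn):  return set(insn[2:])
def writes(insn): return {insn[1]}

def schedule(insns):
    out = list(insns)
    for i in range(len(out) - 1):
        a, b = out[i], out[i + 1]
        if a[0] == "load" and writes(a) & reads(b):      # b stalls on a's load
            used = reads(a) | writes(a) | reads(b) | writes(b)
            for j in range(i + 2, len(out)):
                c = out[j]
                if not ((reads(c) | writes(c)) & used):  # c is independent
                    out.insert(i + 1, out.pop(j))        # fill the load slot
                    break
    return out

prog = [("load", "r1", "r2"),        # r1 = mem[r2]
        ("add",  "r3", "r1", "r1"),  # stalls waiting for the load
        ("mov",  "r4", "r5")]        # independent: can fill the gap
assert schedule(prog) == [("load", "r1", "r2"),
                          ("mov",  "r4", "r5"),
                          ("add",  "r3", "r1", "r1")]
```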
Incidentally, the CISC instruction set of the more recent IBM z machines includes entire C stdlib functions such as strcpy in a single machine-language instruction.
...which is probably implemented as a fast trap to a millicode subroutine in the aforementioned z/Architecture-with-its-own-private-GPR-set-plus-maybe-some-processor-dependent-instructions machine language. The millicode routine doesn't, as far as I know, have to save or restore any GPRs, as it gets to use its own set, and probably runs from memory that's as fast as the level 1 instruction cache rather than from the I-cache, so it might reduce I-cache misses, and can transparently differ from processor to processor in case the best string-copy or string-translate or... instruction sequence differs from processor to processor. Of course, a RISC processor could add "fast call" and "fast return" instructions that do similar GPR-set switching, and have a bigger I-cache, and the OS could make processor-specific string copy routines available, so I don't know how much those instructions would buy you.
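That GPR-set switching could be modeled like this (a purely illustrative toy, not how the z hardware is actually organized):

```python
# Toy model of millicode-style register banking: entering millicode switches
# the active GPR bank, so the routine needs no save/restore of the program's
# registers, and exiting switches back. Names and structure are invented.

class Cpu:
    def __init__(self):
        self.banks = {"program": [0] * 16, "millicode": [0] * 16}
        self.active = "program"

    @property
    def gprs(self):
        return self.banks[self.active]

    def millicode_enter(self):
        self.active = "millicode"   # program GPRs left untouched

    def millicode_exit(self):
        self.active = "program"

cpu = Cpu()
cpu.gprs[3] = 0xDEAD               # program state
cpu.millicode_enter()
cpu.gprs[3] = 0x1234               # millicode scribbles on *its own* R3
cpu.millicode_exit()
assert cpu.gprs[3] == 0xDEAD       # program's R3 survived, no save/restore
```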