> Every instruction is 32 bits long, clogging one's instruction bandwidth.
ARM Cortex-M processors use the Thumb/Thumb-2 instruction sets, which are mostly 16-bit instructions (Thumb-2 mixes in 32-bit encodings). They've had a while to optimise the instruction set for embedded and SoC work.
Yes. Thumb. A major mode switch to use smaller instructions. I've integrated a few ARMs into chips (first the ARM7TDMI) and they were pretty much a nightmare to bring into line with normal OS practices. Fifteen years later, everyone seems to think this cranky instruction set and system model is normal, because it's what they grew up with. Yet the funky interrupt model, the funky mode switching, the lack of standard device discovery (which Linus Torvalds complained about) and bandwidth-hungry instructions do not stand together as an example of a great CPU architecture, just a successful one.
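To make the mode switch concrete: on ARMv4T parts like the ARM7TDMI, the core changes execution state on a branch to an address with bit 0 set (BX), so every ARM-to-Thumb call crosses a state boundary. A minimal sketch, assuming arm-none-eabi-gcc (which accepts per-function target("arm")/target("thumb") attributes); the function names are hypothetical:

```c
/* Hedged sketch of ARM/Thumb interworking, assuming arm-none-eabi-gcc.
 * On ARMv4T you'd also build with -mthumb-interwork; later cores handle
 * interworking implicitly. Function names are illustrative, not a real API. */
#include <stdint.h>

/* Built as 16-bit Thumb code: denser encoding, different execution state. */
__attribute__((target("thumb")))
uint32_t checksum(const uint8_t *buf, uint32_t len)
{
    uint32_t sum = 0;
    while (len--)
        sum += *buf++;
    return sum;
}

/* Built as 32-bit ARM code. Calling checksum() from here makes the
 * toolchain emit interworking: a branch to an address with bit 0 set,
 * which flips the core into Thumb state. */
__attribute__((target("arm")))
uint32_t frame_check(const uint8_t *frame, uint32_t len)
{
    return checksum(frame, len); /* ARM -> Thumb state switch happens here */
}
```

(On modern Cortex-M parts the question is moot, since they run Thumb-only; on the ARM7-era cores, every cross-state call paid this tax.)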
The 68000 series didn't keep up, but it was an order of magnitude easier to work with, especially the microcontroller variants. That stuff matters when you are building products. Atmel do some nice CPUs for the low end that are a pleasure to use.
ARM got their position lodged in our phones by being willing to sell their CPU core at a time in the early 90s when few others would. Back then we were crying out for a PC-on-a-chip, so we could develop a phone radio on a PC card, run the software on the PC, then just port directly to the same machine on an ASIC with the PC plus the phone logic. But the answer was no. GSM back then was heavy lifting. No amount of money would get you access to that core on your ASIC. Meanwhile, ARM was there in Cambridge, ready and willing to take your cheque. If ARC had been a bit quicker, they would have been the incumbent. The ARC CPU certainly was much better (faster, easier, smaller) than the ARM of the time. They sold soft macros too, compared to ARM's hard macros with nightmarish memory bus timing.