I actually use C++ for embedded programming, because when used with care, it can do a better job than C for a number of things. I use template metaprogramming to compute various things at compile time, such as register initialization values. Sure, I can do the same with #define and a boatload of macros, but that has its own issues. Not only are macros messy in their own way, they don't provide a good way to sanity check your settings. With templates and types done right, I can get the compiler to sanity check my settings at compile time. I don't know how many times I've chased down a bug due to swapped macro parameters that could have been caught at compile time with some type checking / trait checking.
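To make the swapped-parameter point concrete, here is a minimal sketch of the general idea. It is not taken from my library; the UART control register layout, the field positions, and every name in it (Parity, StopBits, Field, uart_ctrl) are invented for illustration.

```cpp
#include <cstdint>

// Hypothetical UART control register fields, invented for this example.
enum class Parity   : std::uint32_t { None = 0, Even = 1, Odd = 2 };
enum class StopBits : std::uint32_t { One = 0, Two = 1 };

// A field of Width bits starting at bit Pos, holding a value of type T.
// The layout is checked once, at compile time.
template <typename T, unsigned Pos, unsigned Width>
struct Field {
    static_assert(Width > 0 && Width < 32 && Pos + Width <= 32,
                  "field must fit inside a 32-bit register");
    T value;

    // Mask the value to the field width and shift it into position.
    constexpr std::uint32_t bits() const {
        return (static_cast<std::uint32_t>(value) & ((1u << Width) - 1u)) << Pos;
    }
};

using ParityField = Field<Parity, 4, 2>;    // assumed to live in bits [5:4]
using StopField   = Field<StopBits, 6, 1>;  // assumed to live in bit  [6]

// Each parameter has its own type, so swapped arguments do not compile.
constexpr std::uint32_t uart_ctrl(ParityField p, StopField s) {
    return p.bits() | s.bits();
}

// Evaluated entirely by the compiler; only the constant reaches the binary.
constexpr std::uint32_t kUartCtrlInit = uart_ctrl({Parity::Even}, {StopBits::Two});
static_assert(kUartCtrlInit == 0x50u, "unexpected UART control value");

// uart_ctrl({StopBits::Two}, {Parity::Even});  // error: no conversion between the enums
```

The macro equivalent would take two plain integers and accept them in either order. Here the mistake is a compile error, and the generated code is still just one constant.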
I've written an entire C++-based support library just for this purpose. One of its goals is extreme compactness and cycle efficiency, since the code often needs to run in RTL simulation. Software RTL simulation of a large SoC runs at 10s to 1000s of cycles per second, so cycle efficiency is at an extreme premium.
My library largely replaces other C and assembly code that (often ham-fistedly) computes everything at run time, so my code can handily beat it.
I haven't quite hit the nirvana of generating an entire MMU page tree from a compact memory map description using templates (I have a Perl script for that), but precomputing it sure beats spending 100,000s of cycles or more building it at run time, when that translates to hours of sim time. (Fun fact: Some rather popular modern processors run really slowly until you turn the MMU on, because they can't cache any data until you do.)
I have, however, written dynamic code generators that use templates and function overloading to resolve as much of the opcode encoding as possible at compile time, so that the run-time portion is usually just a "store constant", or maybe a quick field insert into a constant followed by a store. Those can pump opcodes to memory as fast as an opcode per cycle (and in some special cases, faster), which is pretty darn good. Again, it's all type checked as much as possible at compile time, to minimize or eliminate the possibility that I generate invalid instructions.
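A rough sketch of what that style of code generator can look like, under heavy assumptions: the 32-bit instruction format below (opcode in [31:24], rd in [23:20], rs in [19:16], imm in [15:0]) is a made-up ISA, and Reg, Imm16, Emitter and the opcode values are hypothetical names, not my actual library.

```cpp
#include <cstdint>

// Register operands are types, so the register number is a compile-time constant
// and an out-of-range register is rejected by the compiler.
template <unsigned N>
struct Reg {
    static_assert(N < 16, "this made-up ISA only has 16 registers");
};

// Immediates are only known at run time.
struct Imm16 { std::uint16_t v; };

// Hypothetical opcode values.
constexpr std::uint32_t kOpMovReg = 0x01;
constexpr std::uint32_t kOpMovImm = 0x02;

class Emitter {
public:
    explicit Emitter(std::uint32_t* out) : p_(out) {}

    // MOV rd, rs: every field is known at compile time, so the run-time work
    // is a single store of a constant.
    template <unsigned Rd, unsigned Rs>
    void mov(Reg<Rd>, Reg<Rs>) {
        constexpr std::uint32_t word = (kOpMovReg << 24) | (Rd << 20) | (Rs << 16);
        *p_++ = word;
    }

    // MOV rd, #imm: overload resolution picks the immediate encoding. The
    // run-time work is one field insert into a constant, then a store.
    template <unsigned Rd>
    void mov(Reg<Rd>, Imm16 imm) {
        constexpr std::uint32_t base = (kOpMovImm << 24) | (Rd << 20);
        *p_++ = base | imm.v;
    }

    // Next free slot in the output buffer.
    std::uint32_t* pos() const { return p_; }

private:
    std::uint32_t* p_;
};
```

Usage would look like `Emitter e(buf); e.mov(Reg<1>{}, Reg<2>{}); e.mov(Reg<3>{}, Imm16{0x1234});`. Something like `e.mov(Imm16{5}, Reg<1>{})` has no matching overload, so an invalid operand combination never makes it past the compiler.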