L4 and QNX are nice, but do you have an example of their use outside of the embedded space?
L4Linux, i.e. Linux running on L4 as a guest, came out before Xen and paravirtualization were even a thing. The overhead of running on L4 was demonstrably lower than on Xen. So you could run L4 on a desktop with Linux as a guest. Maybe you still can, though the Linux kernel it supports is probably quite outdated by now.
Context-switching overhead has always been the argument against microkernels
Microkernel context-switching overhead is substantially lower than that of hypervisors, which are now everywhere.
So basically, despite their fancy message passing design, to get performance they have to lump everything together into gigantic monolithic applications, albeit running in userspace. Doesn't sound like a great proof-of-principle design to me.
QNX was and is typically deployed in embedded systems where resource constraints dominate. These are domains where you'd use something like the lwIP library embedded directly in your application to get a networking stack. These certainly aren't representative of desktop or server systems, which is presumably what you're asking about.
Furthermore, there's no question that achieving high performance in a decomposed design with lots of isolation boundaries is harder, particularly if you also want security or other properties, which is where researchers mostly focus. But that problem was solved at least 12 years ago. If a final release needs to squeeze out that extra 2-5% of throughput, you can flip a compile-time option to link everything monolithically, but that doesn't mean you should design it monolithically by default.
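To make the compile-time switch concrete, here's a rough sketch in C. It's purely illustrative and not taken from QNX, L4, or any real system: the `LINK_MONOLITHIC` flag, the `checksum_msg` type, and the `ipc_call` helper are all names I made up, and the "decomposed" path only simulates a boundary by copying the message through a mailbox rather than crossing a real address-space boundary. The point is that the service is written once against a message interface, and only the transport changes between builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request/reply message for a toy checksum service. */
typedef struct {
    const uint8_t *data;
    size_t len;
    uint32_t result;
} checksum_msg;

/* The service itself: identical code in both builds. */
static void checksum_handle(checksum_msg *m) {
    uint32_t sum = 0;
    for (size_t i = 0; i < m->len; i++)
        sum += m->data[i];
    m->result = sum;
}

#ifdef LINK_MONOLITHIC
/* Monolithic build: the "IPC" collapses into a direct function call. */
static void ipc_call(checksum_msg *m) {
    assert(m != NULL);
    checksum_handle(m);
}
#else
/* Decomposed build (simulated): the message is copied into a mailbox,
 * the server works on its own copy, and only the reply is copied back.
 * A real microkernel would do a kernel-mediated IPC here instead. */
static checksum_msg mailbox;

static void ipc_call(checksum_msg *m) {
    assert(m != NULL);
    memcpy(&mailbox, m, sizeof mailbox); /* send */
    checksum_handle(&mailbox);           /* server handles the request */
    m->result = mailbox.result;          /* reply */
}
#endif

/* Client-side stub: callers never see which transport was compiled in. */
uint32_t checksum(const uint8_t *data, size_t len) {
    checksum_msg m = { data, len, 0 };
    ipc_call(&m);
    return m.result;
}
```

Building with `-DLINK_MONOLITHIC` selects the direct-call path; building without it selects the message-copy path. Callers of `checksum()` are the same either way, which is the sense in which you can design for isolation and still link monolithically at release time.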
Microkernel "performance issues" are largely a myth. The very first microkernels in the 80s had performance problems due to their design, and simple profiling identified IPC as the bottleneck. Liedtke then designed the L3 microkernel, which solved this problem, and there has never since been an informed performance complaint against microkernels. The myth persists because of that initial impression, and because developers look at the structure of these systems and simply say, "well obviously this will be much slower". Not very scientific. That past research is why microkernel papers focus on IPC; it's just science in action.
Finally, the KeyKOS operating system was a high-security microkernel design that was widely deployed in commercial timesharing systems, and even early ATMs, back in the 70s and 80s. It was proprietary and unpublished until later, and included hard disk drivers in the kernel because its core design included orthogonal persistence and the verifiable security properties depended on an audited disk driver. Other than that it was a legit microkernel and hosted an optional full POSIX guest.