Tanenbaum-Torvalds Microkernel Debate Continues
twasserman writes "Andy Tanenbaum's recent article in the May 2006 issue of IEEE Computer restarted the longstanding Slashdot discussion about microkernels. He has posted a message on his website that responds to the various comments, describes numerous microkernel operating systems, including Minix3, and addresses his goal of building highly reliable, self-healing operating systems."
Page based sockets? (Score:5, Interesting)
As I understand it (as a novice), the only way to communicate or synchronize data is via copies of data passed through something analogous to a socket. A socket is a serial interface. If you think about this for a moment, you realize it could be thought of as one byte of shared memory. Thus a copy operation is in effect the iteration of this one byte over the data to share. At any one moment you can only synchronize that one byte.
But this suggests its own solution. Why not share pages of memory in parallel between processes? This falls short of full access to all of the state of another process, but it would allow locking and synchronization on entire system states, and the rapid passing of data without copies.
Then it would seem the isolation of microkernels would be fully gained without the complications that arise in multiprocessing or compartmentalization.
Or is there a bigger picture I'm missing?
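For the curious, page sharing along these lines already exists on Unix-like systems as POSIX shared memory. A minimal sketch (the object name "/pagesock_demo" is made up; error handling is minimal):

/* Two processes that shm_open() the same name and mmap() it see
   the same physical pages -- no copying, no byte-at-a-time socket
   transfer. Link with -lrt on older systems. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *name = "/pagesock_demo";
    size_t len = 4096;                        /* one page */

    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, len) < 0) { perror("ftruncate"); return 1; }

    char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (page == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(page, "hello from the sharing side");  /* peer sees this */

    munmap(page, len);
    close(fd);
    shm_unlink(name);
    return 0;
}

The catch, as the replies below point out, is that once you share pages you need explicit synchronization again.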
Whatever... (Score:3, Interesting)
Will hardware drivers be developed faster and more reliably with a microkernel? That seems to be the biggest hurdle in reliable OS development these days... Anyone have a good answer for that? I honestly don't know.
Debate on. Microkernels will win in the end. (Score:2, Interesting)
In the CPU-cycle-flush '00s, the debate is just different. Less code running at ring 0 means less code that can cause a kernel panic, blue screen, or whatever they call it in OS X.
A significant part of the market is OK with running Java. The comparatively small performance cost and high stability payoff of a microkernel make the tradeoff a no-brainer.
Re:Andy Tanenbaum ? (Score:3, Interesting)
Linus has written the Linux kernel, used in millions of computers ranging from PCs to mainframes.
Tanenbaum still has Minix and a doctorate.
Education means nothing if you do nothing with it. Linus has applied his education very well and progressed well beyond anything Tanenbaum has accomplished, with or without a doctorate...
Re:Still Debating (Score:3, Interesting)
* QNX Neutrino. This is the most successful microkernel ever. It deserves all the praise it gets. Yet it is still a niche product.
* Hurd. After twenty years we're still waiting for a halfway stable release. Hurd development is almost an argument *for* monolithic kernels!
* Minix. This is still an educational kernel. A teaching tool. It remains unsuitable for "real world" use.
* Mach. People claim OSX is a microkernel since it is built on top of Mach. But that ignores the real world fact that OSX is monolithic. People have been misled by the name.
* NT. This is NOT a microkernel! You don't believe anything else Microsoft says, so why do you believe this fairy tale?
In short, QNX is the only successful real world microkernel. Linus happens to be right on this one: microkernels add too much complexity to the software. From ten thousand feet the high level architecture looks simple and elegant, but the low level implementation is fraught with difficulties and hidden pitfalls.
Re:To Interject for a moment (Score:3, Interesting)
(Yes, I know it's an ugly hack. But it means I don't worry about giving Bash 120MB, and cc some enormous number...)
Re:Page based sockets? (Score:2, Interesting)
I would suggest that this will eventually make its way into kernel systems (just like any other good idea that has come from the programming language fields).
Re:Page based sockets? (Score:4, Interesting)
This is precisely what shared memory is, and it's used all over the place, in Unix and Windows both. When using it, you are of course back to shared data structures and all of the synchronization nastiness, but a) sometimes it's worth paying the complexity price, and b) sometimes it doesn't actually matter if concurrent access corrupts the data if something else is going to correct it (think packet collisions).
Still, if you have two processes that both legitimately need to read and write the same data, you probably need three processes. The communication overhead with the third process is usually pretty negligible.
There are even more exotic concurrency mechanisms that don't require copying or even explicit synchronization, but they're usually functional in nature, and incompatible with the side-effectful state machines of most OSes and applications in existence today.
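To make the "synchronization nastiness" concrete: the usual recipe is to put a process-shared mutex inside the shared region itself. A sketch (assumes POSIX threads; error handling trimmed):

/* A pthread mutex living inside a shared anonymous mapping, so
   that after fork() both processes can lock the same data. */
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>

struct shared_state {
    pthread_mutex_t lock;
    long counter;
};

struct shared_state *make_shared_state(void)
{
    struct shared_state *s = mmap(NULL, sizeof *s,
                                  PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    /* The crucial bit: mark the mutex usable across processes. */
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->lock, &attr);
    pthread_mutexattr_destroy(&attr);

    s->counter = 0;
    return s;   /* fork() after this; both sides share the state */
}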
"Cars don't have reset buttons." My Prius does... (Score:5, Interesting)
It was apparently due to a firmware bug.
In any case, when it happened, according to personal reports in Prius forums from owners to whom it happened, the result was loss of internal-combustion-engine power, meaning they had about a mile of electric-powered travel to get to a safe stopping location. At that point, if you reset the computer by cycling the "power" button three times, most of the warning lights would go off, and the car would be fine again. Of course, many to whom this happened didn't know the three-push trick... and those to whom it did happen usually elected to drive to the nearest Toyota dealer for a "TSB" ("technical service bulletin" = firmware patch).
These days, conventional-technology cars have a lot of firmware in them, and I'll bet they have a "reset" function available, even if it's not on the dashboard and visible to the driver.
Re:Page based sockets? (Score:3, Interesting)
Some other process could not butt in on this channel, however, since it's not registered to that socket.
Or is that how shared memory works?
Tanenbaum's point is that he can have a re-incarnation server that can recreate a stalled process. Thus, by using exclusively message-based communication, he can ensure that this won't disrupt the state of the operating system (because it's all isolated). The problem is when two processes need to share lots of data quickly. Then message passing gets in your way.
Something that had the negotiated nature of a socket, yet allowed two processes to synchronize their large data structures without passing the entire structure serially, would be ideal. Then you could still potentially have things like a re-incarnation server. A new process would simply have to re-negotiate the socket.
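Half of this actually exists on Unix today: you negotiate over a Unix domain socket, then pass a file descriptor for a shared memory region across it with SCM_RIGHTS; the peer mmap()s it, and from then on the pages are shared in parallel. If the peer dies, a re-incarnated replacement just reconnects and receives the descriptor again. A sketch of the sending side (error handling omitted):

/* Pass an open fd (e.g. a shm_open()'d region) to a peer over a
   connected Unix domain socket. The peer recvmsg()s the fd and
   mmap()s it -- negotiated serial setup, parallel sharing after. */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int sock, int fd)
{
    struct msghdr msg = {0};
    char dummy = 'x';
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {                              /* aligned, per cmsg(3) */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;          /* "here, have my descriptor" */
    cm->cmsg_len = CMSG_LEN(sizeof fd);
    memcpy(CMSG_DATA(cm), &fd, sizeof fd);

    return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}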
I should be working... (Score:5, Interesting)
Now in theory I could see a high-availability microkernel being a good, less expensive alternative to a classic mainframe environment, especially if you had a well-written auto-healing system built in as a default. But that would require a lot of work outside the kernel that just isn't being done right now. And until it is, microkernels don't have anything more to offer than monolithic kernels.
To put it in API terms: it doesn't matter very much whether your library correctly returns an error code for every possible circumstance when most user-level code doesn't bother to check it (or just exits immediately, even on addressable errors).
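In other words, the all-too-familiar pattern below (a trivial made-up illustration, not from any real codebase):

#include <stdio.h>

int main(void)
{
    char buf[128];

    /* The library holds up its end: failure is reported via NULL. */
    FILE *f = fopen("/nonexistent/config", "r");

    /* Typical caller behavior: no check at all --
       fgets(buf, sizeof buf, f) would crash right here,
       and the library's careful error reporting buys nothing. */

    /* What callers would actually have to do for the kernel's
       (or library's) reliability guarantees to matter: */
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    fgets(buf, sizeof buf, f);
    printf("%s", buf);
    fclose(f);
    return 0;
}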
Examples prove Linus' point (Score:3, Interesting)
Except for QNX, the systems he cites are either vaporware (Coyotos, HURD), esoteric research toys (L4Linux, Singularity), or brutal violations of the microkernel concept (Mac OS X, Symbian).
Even his best example, QNX, is a very niche product and hard to compare to something like Linux.
Re:Still Debating (Score:5, Interesting)
* Minix. This is still an educational kernel. A teaching tool. It remains unsuitable for "real world" use.
Actually, it's the start of a full-up microkernel operating system. This isn't your grand-pappy's Minix; it's a brand-new code base under the BSD license, intended to be developed out into a complete system. It's still taking baby steps at the moment, but it's coming along quite nicely.
* NT. This is NOT a microkernel!
NT is a hybrid. It has microkernel facilities that are used for something different in each version. Early versions of NT were apparently closer to true microkernels, but this was changed for performance reasons.
* QNX Neutrino. This is the most successful microkernel ever. It deserves all the praise it gets. Yet it is still a niche product.
I would hardly call QNX a "niche" product. Running on everything from your car engine to kiosk PCs (yes, that stupid iOpener ran it too), it's an extremely powerful and versatile operating system. Its microkernel architecture even gives it the ability to be heavily customized for the needs of the application. Don't need networking? Then don't run the server! Need a GUI? Just add the graphics server to the startup.
Microkernels haven't failed. However, you may notice that nearly all the popular operating systems we use today were developed back in the late '80s and early '90s. The real problem is that there hasn't been a need to develop new OSes until now. Now that security and stability are more pressing issues than raw performance, we can go back to the drawing board and start designing new OSes to meet our needs for the next decade and a half.
A CPU-like Kernel (Score:3, Interesting)
Sure you can recompile and all that jazz, but I'd love to see the day when an app could run on any number of kernels out there. That creates real competition.
What I'd like to see is a kernel more like a CPU. Instead of linking your kernel calls, you place them as if you were placing an assembly call. Then we could have many companies and open-source organizations writing versions of it.
As we move towards multi-core CPUs, this could really lead to performance gains: one or more of the many cores could be dedicated to kernel operations, listening for requests and taking care of them. No context switches needed, no privilege-mode switching.
Drivers and everything else would run outside of kernel mode and use the low-level microcode to execute their code.
The best part, I think, is that you could make it backward compatible as we rewrite: a layer could handle old kernel calls and translate them into the microcodes.
As we define everything more and more, we might even be able to design CPUs that can handle it better. (A toy sketch of the dedicated-core idea follows.)
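Here's roughly what the "core that listens for operations" part looks like in user-space C: a shared ring of call slots that a dedicated core polls, so the caller never traps or switches privilege mode. Everything here is invented for illustration; no real kernel ABI works this way:

/* Toy "syscall ring": the app writes a descriptor into a slot and
   spins; a dedicated core polls the ring and fills in results.
   C11 atomics; pinning the polling thread to a core is omitted. */
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 256

struct call_slot {
    atomic_uint ready;      /* 0 = free, 1 = submitted, 2 = done */
    uint32_t    opcode;     /* which "kernel op" to run */
    uint64_t    arg;
    uint64_t    result;
};

static struct call_slot ring[RING_SIZE];

/* Application side: submit and spin (a real design would block). */
uint64_t submit_call(unsigned slot, uint32_t op, uint64_t arg)
{
    ring[slot].opcode = op;
    ring[slot].arg = arg;
    atomic_store(&ring[slot].ready, 1);
    while (atomic_load(&ring[slot].ready) != 2)
        ;                            /* the "kernel core" is working */
    return ring[slot].result;
}

/* "Kernel core" side: pinned to its own core, polls forever. */
void kernel_core_loop(void)
{
    for (;;)
        for (unsigned i = 0; i < RING_SIZE; i++)
            if (atomic_load(&ring[i].ready) == 1) {
                ring[i].result = ring[i].arg + 1;  /* stand-in for work */
                atomic_store(&ring[i].ready, 2);
            }
}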
Re:Whatever... (Score:3, Interesting)
I noticed two things about Tanenbaum's piece, though. Essentially all of the microkernels he listed were either used in dedicated (including embedded) systems or were not true microkernels by his own admission. Or, like HURD, they were not commonly used in production. I would add Unicos/mk to the dedicated-systems category, because it is designed to run a single process efficiently across a large number of processors.
So here we have it. Microkernels are *far* better in some environments where the computer has a single (and often well-defined) task, especially on well-defined hardware; but for things like mainframes or network servers they are not commonly used. I can only suggest that the tradeoffs are more subtle than is commonly discussed here.
Re:To Interject for a moment (Score:1, Interesting)
I think I'd prefer the Linux network stack, which AFAIK simply doesn't crash in the first place.
Well... this is a more interesting comment than a first reading might suggest. I've always been a bit dubious of "tolerant" software. It might sound counter-intuitive, but I'd rather have libraries/kernels terminate the running program and output a big message saying why than tolerate problems and try to continue. In the long term it pays off.
A lot of problems in Windows come from Microsoft trying to build Windows to be extremely tolerant of crap software and bizarre library calls, and to keep running as long as possible... and that has come back to bite them. Lots of strange failures that never get fixed because they don't terminate the program, they just end up generating shite later.
I prefer my libraries and kernels to just say... wrong. Fuck off. Or at the very least spazz out and crash spectacularly -- because those sorts of problems GET FIXED QUICKLY. It sounds like a hacky and roundabout idea... but it does work. It forces software to be fixed instead of being tolerant of its faults. Perhaps if you were in an academic environment and in total control of the entire software stack, the pure platonic ideal of development would work.
But you aren't, and it doesn't. Crashes and terminated programs get noticed and the problems fixed. It's real life coding in a nutshell.
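The whole philosophy fits in a few lines of C. A trivial illustration (my own, not from any particular codebase):

#include <stdio.h>
#include <stdlib.h>

/* Fail-fast: die loudly at the point of the error so the bug gets
   noticed and fixed, instead of limping on and generating garbage
   three subsystems later. */
void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (!p) {
        fprintf(stderr, "FATAL: out of memory allocating %zu bytes\n", n);
        abort();    /* a core dump beats silent corruption */
    }
    return p;
}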
Linux Will Stay "Mono" 'Til Linus Can't Take It (Score:3, Interesting)
Years later, Tanenbaum still makes valid observations, Linus and others continue to make a rather larger project jump through the hoops, and that's fine. The results of academic research may or may not get traction outside of a university, but without the research, there wouldn't be alternatives to contemplate. If I've gathered nothing else about Linus' personality from his writings over the years, it's that he seems to be practical, not particularly hung up on architectural (or licensing) theories... unlike me.
At some point, if his current architecture just isn't doing it for him any more, he might morph into Tanenbaum's 'A' student. It won't be because a microkernel was always right, but that it was right now.
The Big Picture (Score:2, Interesting)
The start and end of the series hit it home:
The dependence on technology, and the use of technology to get us out of trouble created by technology.
How one technology is so interdependent on other technology that one failure can cause the whole technology sector to fail.
(Initial reference to electricity grid failures, usually caused by one small part failing.)
Wouldn't building a system so that one part cannot take down all the other parts be important?
How would the Internet be if the entire thing would fail if one part stopped?
Yes, I want a computer without a reset button...
Lack of threading is a benefit. (Score:2, Interesting)
Threading is the spawn of the Solaris. Oops, I mean the devil. Forking was so slow on Solaris that they had to invent threading to have multiple contexts run at any speed. The whole point behind a microkernel is to HIDE information. Threading EXPOSES information between separately running processes, so you need to have mutexes, semaphores, and all of that synchronization crap that makes for buggy code.
Threading is bad. Don't use it. When you have to use code that uses it, refactor the code to use processes or a state machine. It can be done. Don't whine. But don't use threads either.
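For reference, the refactor the parent is talking about, in its simplest form: a worker process instead of a worker thread. Separate address spaces mean there's nothing to mutex; the result comes back over a pipe instead of shared state (a minimal sketch):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int pipefd[2];
    if (pipe(pipefd) < 0) return 1;

    pid_t pid = fork();
    if (pid == 0) {                      /* worker process */
        close(pipefd[0]);
        int answer = 6 * 7;              /* stand-in for real work */
        write(pipefd[1], &answer, sizeof answer);
        _exit(0);
    }

    close(pipefd[1]);                    /* parent: just read the result */
    int result = 0;
    read(pipefd[0], &result, sizeof result);
    waitpid(pid, NULL, 0);
    printf("worker says: %d\n", result);
    return 0;
}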
Micro/Macro - how about run time modifiable (Score:3, Interesting)
The point Andy makes that I agree with is that computer software is still in its infancy. The part I disagree with is the idea that it'll change because he states the obvious.
Re:The truth about microkernels (Score:3, Interesting)
I/O channels would help, like IBM mainframe channels, which have an MMU between the peripheral and main memory...
I've heard from a friend at Intel that their new chipsets which fully support TCPA have this feature. So maybe trusted computing isn't just about copy prevention.
Re:Let's see... (Score:3, Interesting)
And Minix 3 doesn't count for real-world use. It may be a good starting point for one, and maybe Minix 3.5 will get there. But as of today it would be silly to put Minix on anything but a hobbyist system.
history in the making (Score:4, Interesting)