Microkernel: The Comeback?
bariswheel writes "In a paper co-authored by the Microkernel Maestro Andrew Tanenbaum, the fragility of modern kernels is addressed: "Current operating systems have two characteristics that make them unreliable and insecure: They are huge and they have very poor fault isolation. The Linux kernel has more than 2.5 million lines of code; the Windows XP kernel is more than twice as large." Consider this analogy: "Modern ships have multiple compartments within the hull; if one compartment springs a leak, only that one is flooded, not the entire hull. Current operating systems are like ships before compartmentalization was invented: Every leak can sink the ship." Clearly one argument here is that security and reliability have surpassed performance in terms of priorities. Let's see if our good friend Linus chimes in here; hopefully we'll have ourselves another friendly conversation."
Re:Eh hem. (Score:3, Informative)
Oh Dear (Score:1, Informative)
http://people.fluidsignal.com/~luferbu/misc/Linus
We've got Andy Tanenbaum coming up with nothing practical in the fifteen or sixteen years he's been promoting microkernels, and then turning around and telling us he was right all along. Meanwhile, the performance of OS X sucks like a Hoover, as we all knew:
http://sekhon.berkeley.edu/macosx/intel.html [berkeley.edu]
I'll just pretend I didn't see this article.
Re:Theory Vs. Practice (Score:4, Informative)
Just to clear things up, my understanding is that Tanenbaum is advocating moving the complexity out of kernel space to user space (such as drivers). So you wouldn't be lowering the size/complexity of the kernel altogether, you'd just be moving huge portions of it to a place where it can't do as much damage to the system. Then the kernel just becomes one big manager which tells the OS what it's allowed to do and how.
- shazow
Re:Feh. (Score:1, Informative)
Re:NT4 (Score:1, Informative)
The "cleanest" NT versions were NT 3.1, 3.5 and 3.51.
Re:Feh. (Score:3, Informative)
O RLY [anandtech.com]
Re:How hard... (Score:3, Informative)
Re:Proof is in the pudding (Score:2, Informative)
"The proof is in the pudding" is actually an INCORRECT usage of the actual saying. If you think about it, it makes no sense -- what proof is in the pudding (and what pudding?!)?
Instead, the actual, proper phrase is: "The proof of the pudding is in the eating."
QNX ! (Score:5, Informative)
QNX [qnx.com], but it isn't open source.
VxWorks [windriver.com] and a few others would also fit.
Mklinux (Score:3, Informative)
QNX for teh win :) (Score:4, Informative)
Unlike certain other OSes, QNX is used in control applications with life-and-death implications (nuclear reactors and medical equipment, for example).
QNX has been through a lot of changes since then. And I have not kept up with most of them. I do know that as of a few years ago they did make a "free for personal use" release that included their development system. And a few years before that, they had a 1.44meg demo disk that had their entire OS, GUI and web browser on it.
But don't take my word for it; go check out their website.
Re:NT4 (Score:3, Informative)
This, as people can imagine, complicates everything (in a particular part of NT that is already complicated enough and not terribly well documented).
Re:Theory Vs. Practice (Score:2, Informative)
Re:Eh hem. (Score:4, Informative)
No, it's compartmentalization of the applications. Besides, the analogy is really bad because a ship with a blown compartment is quite useful. Computers with a blown network driver will e.g. break any network connections going on, in other words a massive failure. What about a hard disk controller which crashes while data is being written? Drivers should not crash, period. Trying to make a system that could survive driver failure will just lead to kernel bloat with recovery code.
Re:multicompartment isolation (Score:3, Informative)
I've coded under QNX a lot and would strongly disagree with your view on the message passing overhead. From this QNX [wikipedia.org] page:
QNX interprocess communication consists of sending a message from one process to another and waiting for a reply. This is a single operation, called MsgSend. The message is copied, by the kernel, from the address space of the sending process to that of the receiving process. If the receiving process is waiting for the message, control of the CPU is transferred at the same time, without a pass through the CPU scheduler. Thus, sending a message to another process and waiting for a reply does not result in "losing one's turn" for the CPU. This tight integration between message passing and CPU scheduling is one of the key mechanisms that makes QNX message passing broadly usable. Most UNIX and Linux interprocess communication mechanisms lack this tight integration, although an implementation of QNX-type messaging for Linux does exist. Mishandling of this subtle issue is a primary reason for the disappointing performance of some other microkernel systems.
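To make the send/wait-for-reply pattern in that quote concrete, here is a minimal sketch in Python. It is not the QNX API: the `Channel` class and its `send`, `receive` and `reply` methods are hypothetical names made up for illustration, and threads stand in for separate processes. It also cannot show the part the quote stresses most, the direct CPU handoff that bypasses the scheduler; it only shows the blocking semantics.

```python
import queue
import threading


class Channel:
    """Toy stand-in for a synchronous message channel (hypothetical API)."""

    def __init__(self):
        self._inbox = queue.Queue()

    def send(self, message):
        """Send a message and block until the receiver replies."""
        reply_slot = queue.Queue(maxsize=1)
        self._inbox.put((reply_slot, message))
        return reply_slot.get()  # the sender sleeps here until reply() runs

    def receive(self):
        """Block until a message arrives; return (reply_handle, message)."""
        return self._inbox.get()

    @staticmethod
    def reply(reply_handle, result):
        """Unblock the sender that owns reply_handle."""
        reply_handle.put(result)


def file_server(channel):
    """A user-space 'server' that answers requests until told to stop."""
    while True:
        handle, request = channel.receive()
        if request == "shutdown":
            Channel.reply(handle, "bye")
            return
        Channel.reply(handle, request.upper())


channel = Channel()
server = threading.Thread(target=file_server, args=(channel,))
server.start()

answer = channel.send("read block 7")
farewell = channel.send("shutdown")
server.join()
print(answer, farewell)
```

From the caller's side, `send` looks like an ordinary function call that returns a result, which is a large part of why this style of IPC is pleasant to program against.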
Tanenbaum is wrong, and should know it (Score:5, Informative)
Kernels don't often crash for reasons related to lack of memory protection. It's quite silly to imagine that memory protection is some magic bullet. Kernel programmers rarely make beginner mistakes like buffer overflows.
Kernels crash from race conditions and deadlocks. Microkernels only make these problems worse. The interaction between "simple" microkernel components gets horribly complex. It's truly mind-bending for microkernel designs that are more interesting than a toy OS like Minix.
Kernels also crash from drivers causing the hardware to do Very Bad Things. The USB driver can DMA a mouse packet right over the scheduler code or page tables, and there isn't a damn thing that memory protection can do about it. CRASH, big time. A driver can put a device into some weird state where it locks up the PCI bus. Say bye-bye to all devices on the bus. A driver can cause a "screaming interrupt", which is where an interrupt is left demanding attention and never gets turned off. That eats 100% CPU. If the motherboard provides a way to stop this, great, but then any device sharing the same interrupt will be dead.
I note that Tanenbaum is trying to sell books. Hmmm. He knows his audience well too: those who can't do, teach. In academia, cute theories win over the ugly truths of the real world.
Re:The unsinkable Kernel (Score:3, Informative)
Hurd in Google's summer-of-code (Score:4, Informative)
Re:Feh. (Score:5, Informative)
Well, trust is placed in those user-land programs to perform the task for which they are responsible. Whereas in a monolithic kernel, trust is placed in each subsystem to not only perform the task it is responsible for, but also not to muck with the workings of every other subsystem in the kernel as they all reside in the same address space. Therefore in a microkernel you can have a bug in your network stack without compromising your file system driver or authentication module, while this isn't necessarily true in a macrokernel. Compartmentalization is very good for security.
Which is just one of the reasons Mach is so popular as a research OS, despite never seeing any success in the real world. Compartmentalization also makes the OS easier to maintain, easier to understand, and easier to make modifications for. Plus it's very easy to port to new hardware, if that's required.
In a sense, most OSes are "microkerneled" anyway. Most functionality is implemented by programs running on top of the kernel, which pass messages back and forth between themselves and the kernel. Perhaps my view on this is a little naive, but I don't see too much of a difference between a microkernel module and any other process on the machine.
I think you underestimate the things that are handled by the kernel? Unix uses many user-land services, but also has many services integrated into the kernel. Take the concept of moving functionality into user space to the limit, and you have a microkernel. Your last observation isn't naive, it's correct: a microkernel module isn't necessarily any different than any other process on your machine.
Re:A Good example? (Score:3, Informative)
How can the average user see this? When "Software Update" runs, almost any update to the system (not updates to an Apple application like iTunes) will require a restart of the whole machine. In a true Microkernel design you might need to relaunch the Finder or restart the communications architecture, but unless something changes kernel space code you wouldn't need to restart the whole computer.
The uptime command would give Apple proponents much more to brag about if it were a true microkernel, but beyond hardware abstraction I don't think Apple has the same needs for a microkernel architecture as others. Since that's the case, I don't think it's fair to hold it up as an example of the fatal sins of microkernels in general. Nor do I think dragging in your personal valuations of speed and stability is a rigorous indictment of Mac OS X's performance either.
Minix (Score:3, Informative)
Guess what he told me. A revamped version of Minix is coming.
Re:Proof is in the pudding (Score:2, Informative)
One thing which hasn't really been touched on in this thread yet is exactly what the modularity means for end users. It isn't just that the kernel is simplified. When the drivers and other subsystems are external to the kernel, their failure can become a non-issue. This *does not* just mean that they won't take the whole system down. Minix has a process called the resurrection server. It monitors all of the various drivers and subsystems and, if one of them fails, will attempt to get it running again. Professor Tanenbaum showed us data where the hard disk controller was failing a ridiculous number of times per second, but the system was still maintaining about 90% of the speed that it would have if it weren't failing at all. The OS design can actually make up for poorly designed drivers.
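For anyone who wants to see the restart idea in miniature, here is a rough sketch in Python. It is not how Minix implements it. The "driver" is a made-up child process (the `DRIVER_SOURCE` script is hypothetical) that crashes on its first two starts, and `supervise` is an illustrative name for a loop that restarts it whenever it exits abnormally. The point is only that the supervisor survives because the driver lives in its own process.

```python
import subprocess
import sys

# Hypothetical driver: crashes (exit status 1) on its first two starts,
# then exits cleanly. A real driver would fail unpredictably.
DRIVER_SOURCE = """
import sys
attempt = int(sys.argv[1])
sys.exit(1 if attempt < 3 else 0)
"""


def supervise(max_restarts=5):
    """Start the driver and restart it whenever it exits abnormally.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    attempt = 1
    while True:
        result = subprocess.run(
            [sys.executable, "-c", DRIVER_SOURCE, str(attempt)]
        )
        if result.returncode == 0:
            return restarts
        if restarts == max_restarts:
            raise RuntimeError("driver keeps failing; giving up")
        restarts += 1
        attempt += 1


restarts_needed = supervise()
print(f"driver recovered after {restarts_needed} restarts")
```

The `max_restarts` cap is a design choice worth keeping in any real version: a driver that fails forever should eventually be reported rather than restarted in a tight loop.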
Re:Feh. (Score:5, Informative)
There were several flaws in their tests:
1. They used the GCC 3.x compiler instead of the GCC 4.x compiler shipping with Tiger because the Linux distros they were comparing against had not updated to GCC 4.x yet.
2. They did not include the OS X specific patches to alter the threading mechanism. This caused a significant performance hit, as MySQL was written for the Linux threading model rather than a Mach one or a more generic model.
3. Binary builds with OS X specific patches were available for download via links from the official sites. There was no need to compile a crippled version.
4. They should have also tested the free/evaluation versions of Oracle, as there are optimized versions available for both Linux and OS X. Assuming this was not a test of only OSS but rather of performance as a "server", I do not see why they did not include it.
Re:Metaphors eh? (Score:4, Informative)
The goal is to have a system such that you maximize the segregation of the parts. If the SCSI subsystem crashes -- for example -- you flush it and restart it. While it may not be possible to totally isolate every subsystem, with a microkernel subsystems should be more robust than in monolithic kernels.
For all of Linus' scorn of microkernels, Linux borrows heavily from the concept, if not from the theory. One could almost say that Linux implements a microkernel poorly through the kernel module interface. It fails to be a true microkernel in a number of ways, though, not least of which is the low degree to which it isolates modules.
In any case, your nervousness about a system where a "fundamental" subsystem craps out is understandable in someone whose main experience is with monolithic kernels, because the corruption of one subsystem often infects other systems. For example, IME when the Linux SCSI module starts barfing (which happens with distressing regularity), if you're lucky, you can unload and reload the SCSI modules, but eventually, you're going to have to reboot, because it never quite works well after a reload. In a microkernel, subsystems are just services that other subsystems may use, but aren't intimate with. A corruption in one subsystem shouldn't lead to corruptions in any other subsystem.
--- SER
Re:Feh. (Score:3, Informative)
Where does NT implement all that? In kernel space. A NULL pointer in that code brings the system down.
Just because it was STARTED from a microkernel (like Mac OS X) doesn't mean it's a REAL microkernel. How can you call something a "microkernel" when it implements the filesystem in kernel space?
Restarting drivers (Score:5, Informative)
Drivers have measurably more bugs in them than other parts of the kernel. This has been shown by many studies (see the third reference in the article). This can also be shown empirically - modern versions of Windows are often fine until a buggy driver gets on to them and destabilises things. Drivers are so bad that XP even warns you about drivers that haven't been through checks. Saying people should be careful just doesn't cut it and is akin to saying people were more careful in the days of multitasking without protected memory. Maybe they were, but some program errors slipped through anyway, bringing down the whole OS when I used AmigaOS (or Windows 95). These days, if my web browser misbehaves, at least it doesn't take my word processor with it; losing the web browser is pain enough.
In all probability you would know that a driver had to be restarted because there's a good chance its previous state had to be wiped away. However a driver that can be safely restarted is better than a driver that locks up everything that touches it (ever had an unkillable process stuck in the D state? That's probably due to a driver getting stuck). You might even be able to do a safe shutdown and lose less work. From a debugging point of view I prefer not having to reboot the machine to restart the entire kernel when a driver goes south - it makes inspection of the problem easier.
(Just to prove that I do use Minix though I shall say that killing the network driver results in a kernel panic which is a bit of a shame. Apparently the state is too complex to recover from but perhaps this will be corrected in the future).
At the end of the day it would be better if people didn't make mistakes but since they do it is wise to take steps to mitigate the damage.
Re:How hard... (Score:3, Informative)
Also, in the early 90's Tenon Intersystems had a MacOS running on Mach that had some UNIXy stuff underneath as well.
Then, of course, there was mkLinux, known for being almost compatible with Linux, almost compatible with most hardware, and almost as fast as just running Linux.
All of them ran reasonably well, but none really embraced the kernelized design from UI to drivers to hardware, and all were to varying degrees slower than the mainstream versions of their software.
Re:Restarting drivers (Score:3, Informative)
However, the driver certification program is to some extent a waste of time anyway:
I think a big part of the problem is that it really isn't worth the driver writer's development costs to make the drivers stable. There is often a rapid turnover of hardware so they need to keep revising the driver and so long as it's stable enough that the average user doesn't realise it's that driver that's to blame before the product is end-of-lifed then what benefit is it to the manufacturer to spend the extra cash to make the driver stable?
However a driver that can be safely restarted is better than a driver that locks up everything that touches it (ever had an unkillable process stuck in the D state? That's probably due to a driver getting stuck).
The D state _is_ a bug, and in many cases an example of lazy coding. It's the "oops something went wrong but we don't want to complicate our code by catching the error and cleaning up so let's put the machine into an unrecoverable state".
For example, if you pull a USB mass storage device while it's mounted (a very silly thing to do, but it really shouldn't break the machine) then all the processes that try to access it will probably drop into the D state. There is no good reason for this - the filesystem driver has asked the USB block device driver to read or write some data and the USB block device driver _knows_ full well that the device has gone away, so it should return a failure which the filesystem driver can catch and (after cleaning up locks, etc.) return to userland as a failed operation. Unfortunately, rather than catching this error gracefully, either the block driver or the filesystem driver just gives up and goes to sleep waiting for an I/O operation that will never complete.
In this case, the D state is no better than a user's application bombing out on an ASSERT() failure - something went wrong, we can't be bothered to even save the user's work to a recovery file, let's bomb out losing the lot - if that's not a bug I don't know what is. (Yes, I'm aware that data integrity can't be guaranteed in many cases but you should at least dump out the (potentially corrupt) data to a recovery file).
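The error path described above (block driver reports the failure, filesystem layer cleans up its locks and passes the error up) can be sketched in a few lines of Python. This is an illustration only, not kernel code: `BlockDevice`, `Filesystem` and `read_file_block` are hypothetical names, and an `OSError` with `ENODEV` stands in for whatever error the real driver would return.

```python
import errno
import threading


class BlockDevice:
    """Hypothetical block device that can be unplugged at any time."""

    def __init__(self):
        self.present = True

    def read(self, block_number):
        if not self.present:
            # The driver knows the device is gone, so fail immediately
            # instead of waiting for I/O that can never complete.
            raise OSError(errno.ENODEV, "device removed")
        return b"\x00" * 512


class Filesystem:
    """Hypothetical filesystem layer sitting on top of a BlockDevice."""

    def __init__(self, device):
        self.device = device
        self.lock = threading.Lock()

    def read_file_block(self, block_number):
        # The with-statement releases the lock even when the read fails,
        # so other callers are not left blocked behind a dead device.
        with self.lock:
            return self.device.read(block_number)


device = BlockDevice()
fs = Filesystem(device)
first = fs.read_file_block(0)

device.present = False  # simulate pulling the USB stick
try:
    fs.read_file_block(1)
    outcome = "ok"
except OSError as err:
    outcome = errno.errorcode[err.errno]

print(len(first), outcome, fs.lock.locked())
```

The caller gets a failed operation it can handle, and the lock is free again afterwards, which is exactly what a process stuck in the D state never gets.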
At the end of the day it would be better if people didn't make mistakes but since they do it is wise to take steps to mitigate the damage.
I think there is some truth in the "less risk increases laziness" idea, but I do agree that mitigating the damage is more important than scaring coders away from laziness.
Re:Cue the peanut gallery. (Score:5, Informative)
Do you actually want people to take you seriously when you post utter shit like this?
That is a veiled lie. Mach performed very poorly mostly because of message _validation_, not message passing (although it was pretty slow at that too). I.e. it spent a lot of cycles making sure messages were correct. L3/L4 and K42 simply don't do any validation, they leave it up to the user code. In other words, once you put back in userland the validation that Mach had in kernel space, things are a bit more even. And for the love of god NT is NOT a microkernel. It never was a microkernel. And stop using the term "hybrid", all hybrid means is that the marketing dept. wanted people to think it was a microkernel...
Now I will throw a few "facts" at you. It is possible with a lot of clever trickery to simulate message passing using zero-copy shared memory (this is what L3/L4/K42/QNX/etc. do, as does any microkernel wanting to do message passing quickly). And if done correctly it CAN perform in the same league as monolithic code for many things where the paradigm is a good fit. But there are ALWAYS situations where it is going to be desirable for separate parts of an OS to directly touch the same memory in a cooperative manner, and when this is the case a microkernel just gets in your damn way...
Ok... Two things. OpenBSD is pretty much the slowest of all BSD derivatives (which is fine, those guys are more concerned with other aspects of the system and its users are as well), so using it in this comparison shows an obvious bias on your part... Secondly, and please listen very closely because this bullshit needs to stop already, !!OSX IS NOT A MICROKERNEL!! It is a monolithic kernel. Yes it is based on Mach, just like mkLinux was (which also was not a microkernel). Let's get something straight here, being based on Mach doesn't make your kernel a microkernel, it just makes it slow. If you compile away the message passing and implement your drivers in kernel space, then you DO NOT have a microkernel anymore.
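As a rough illustration of the shared-memory idea (not of how any of the kernels named above actually implement it), here is a small Python sketch using the standard `multiprocessing.shared_memory` module. Both "sides" live in one process here purely to keep the example short; in practice the receiver would be a separate process that attaches by name. The variable names and the 64-byte size are arbitrary assumptions.

```python
from multiprocessing import shared_memory

payload = b"mouse packet"

# "Sender" side: create a named shared region and write the payload into it.
sender_view = shared_memory.SharedMemory(create=True, size=64)
sender_view.buf[:len(payload)] = payload

# "Receiver" side: attach to the same region by name. Only the name and the
# length would need to travel in the message; the payload is not copied.
receiver_view = shared_memory.SharedMemory(name=sender_view.name)
received = bytes(receiver_view.buf[:len(payload)])  # copy only to inspect it

# A write by the receiver is visible to the sender: same underlying memory.
receiver_view.buf[0:5] = b"MOUSE"
seen_by_sender = bytes(sender_view.buf[:len(payload)])

receiver_view.close()
sender_view.close()
sender_view.unlink()  # free the region once both sides are done

print(received, seen_by_sender)
```

Note what the sketch leaves out: nothing stops either side from scribbling over the other's data, which is the validation cost that has to be paid somewhere.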
So what you actually said in your post could be re-written like this:
Fact: OSX is sooooo slow that the only thing it is faster than is OpenBSD. And you can't even blame its slowness on it being a microkernel. How pathetic... Wow, that says it all in my book :)
And no, you don't have to believe me... Please read this [usenix.org] before bothering to reply.
Re:Feh. (Score:1, Informative)
The obvious one is to provide paging.
The other one is so that processes run in their own address spaces.
One of the other posters mentions the Microsoft research kernel written in C# (plus constraints). Apparently Tanenbaum does that as well. While this is an interesting system, it's also a SASOS (single address space OS). Which (in a real environment) has the major drawback that any process can DoS the whole system simply by (over)allocating memory. To prevent that you need processes to run in their own address space. If you do that you need some mechanism to transfer memory pages between processes (aka messages), and you have to make a context switch when you cross address space boundaries. And if you do that it doesn't matter that some of your system is in ring 0 and the rest in ring 3, you'll incur the cost anyway.
That's not to say that there's no merit in writing (most of) the OS in a higher level language and putting the JIT compiler in there as well, thus allowing code to be safely (-ish) uploaded in the kernel (with the associated speed benefits), but a) that's hardly a microkernel approach, b) you'd still run the bulk of the code in user land, and c) it's most definitely not new. Look up exokernel.
Re:multicompartment isolation (Score:3, Informative)
Other better examples of microkernel (Score:2, Informative)