Tanenbaum-Torvalds Microkernel Debate Continues
twasserman writes "Andy Tanenbaum's recent article in the May 2006 issue of IEEE Computer restarted the longstanding Slashdot discussion about microkernels. He has posted a message on his website that responds to the various comments, describes numerous microkernel operating systems, including Minix3, and addresses his goal of building highly reliable, self-healing operating systems."
To Interject for a moment (Score:5, Informative)
First and foremost, does anyone have a torrent of Minix3? Tanenbaum is a bit worried [google.com] about getting slashdotted. If you've got one seeded, please share.
Now with that out of the way. I don't know if anyone else has tried it yet, but Minix3 is kind of neat. It's a complete OS that implements the Microkernel concepts that he's been expounding on for years now. The upsides are that it supports POSIX standards (mostly), can run X-Windows, and is a useful development platform. Everything is very open, and still simple enough to trudge through without getting confused by the myriads of "gotchas" most OS code-bases contain. Unfortunately, it's still a long way from a usable OS.
The biggest issue is that the system is lacking proper memory management. It currently uses static data segments which have to be predefined before the program is run. If the program goes over its data segment, it will start failing on mallocs. The result is that you often have to massively increase the data segment just to handle the peak usage. Right now I have BASH running with a segment size of about 80 megs just so I can run configure scripts. That means that every instance of BASH is taking up that much memory! There's apparently a Virtual Memory system in progress to help solve this issue, so this is (thankfully) a temporary problem.
The other big issue is a lack of threading support. I'm trying to compile GNU PThreads [gnu.org] to cover over this deficiency, but it's been a slow process. (It keeps failing on the mctx stack configuration. I wish I understood what that was so I wouldn't have to blindly try different settings.)
On the other hand, the usermode servers do work as advertised. For example, the network stack occasionally crashes under VMWare. (I'm guessing it's the same memory problems I mentioned earlier.) Simply killing and restarting dhcpd actually does get the system back up and running. It's kind of neat, even though it does take some getting used to.
All in all, I think it's a really cool project that could go places. The key thing is that it needs attention from programmers with both the desire and time to help. Tossing lame criticisms won't help the project reach that goal. So if you're looking to help out a cool operating system that's focused on stability, security, and ease of development, come check out Minix for a bit. The worst that could happen is that you'll decide that it isn't worth investing the time and energy. And who knows? With some work, Minix might turn out to be a good alternative to QNX.
Re:Andy Tanenbaum ? (Score:3, Informative)
Read Tanenbaum's Wikipedia bio [wikipedia.org].
Go check the article out. (Score:5, Informative)
If you have a computer science degree you have probably used at least one if not more of his textbooks. He's one of the more prominent computer science researchers of the last couple decades.
Re:To Interject for a moment (Score:3, Informative)
Re:Still Debating (Score:2, Informative)
Minix 3 screenshots (Score:5, Informative)
I almost died of boredom looking for them. Here's the link, for the lazy:
http://www.minix3.org/doc/screenies.html [minix3.org]
Re:Andy Tanenbaum ? (Score:3, Informative)
No, actually he created two that I know of. Well, technically three since MINIX 3 is probably sufficiently different from MINIX 1 to be thought of as a different kernel. Amoeba was another microkernel-based OS designed to run on distributed systems, presenting an entire cluster as a single machine.
MINIX 1 was a teaching tool. MINIX 3 is a real OS, although still very young (less than two years old), but doing very well. Amoeba is so far ahead of Linux conceptually that they don't even belong in the same category.
Hybrids are a first generation device... (Score:3, Informative)
And even if you lumped them into cars, so, you have what, a few hundred Priuses that have reset buttons, among the hundreds of millions of cars. And every computer in existence still has a reset button, and at some point in time that reset button has been exercised.
Re:Page based sockets? (Score:2, Informative)
But it's not free-form shared memory. One process still has to request and/or accept a socket from another. They are isolated. The other process only gets the page pointer when the first one sends it.
Thus not every process can write to every bit of shared memory, because they all live in separate process spaces, not just a kernel space.
If a process dies, a re-incarnation server can restart it and it can once again send a message requesting the page of memory. It's still message based to request or send pages. Thus it maintains the robust, isolated, message based nature of a micro kernel in many ways.
Or is this how shared kernel memory works now in Linux?
Try QNX. (Score:2, Informative)
Performance Hit of uKs unacceptable for most users (Score:3, Informative)
Okay, I spent 2 years working as an engineer in the OSF's Research Institute developing Mach 3.0 from 1991. Let me answer Linus's question in a simple fashion. What Mach 3.0 bought you over Mach 2.5 or Mach 2.0 was a 12% performance hit, as every call to the OS had to make a User Space -> Kernel -> User Space hit. This was true on x86, Moto and any other processor architecture available to us at the time. Not one of our customers found this an acceptable price to pay and I very much doubt they would today. One of the reasons Microsoft moved a lot of functionality into the Kernel between NT 3.5 and NT 4.0 was performance (NT being, at its origins, a uK-based OS).
What of the advantages ?
Is porting easier? No not really, the machine dependent code in Mach 2.5 and Mach 3.0 was already well abstracted.
You could run two OS personalities at once, for example you could have an Apple OS and Unix running at the same time. But why would any real world clients want to do this?
Problems in the OS personality wouldn't bring down the uKernel - but they might stop you doing any useful work while you reboot the OS personality.
Other things like distributed operating systems (and associated fault tolerance) were perhaps aided by the uK design and this is a path that, in my humble opinion, the OSF should have pursued with greater zeal than they did. Back in 1991 we had a Mach 3.0 based system that ran a uK across an array of x86 nodes but had different parts of the OS - say IO or memory management running on different nodes. From a user standpoint all the machines (in reality bog standard 386 machines linked by FDDI) looked like a single computer running a Unix like OS.
I remember discussing Linux with my colleagues back in 1993. Some were impressed and thought the nascent OS model was very powerful; others dismissed it as a toy with no real future. I suspect Tanenbaum was also amongst the poo-pooers and has become pretty annoyed about how things have turned out.
Page based messages (Score:3, Informative)
Re:Minix is already on version 3 (Score:3, Informative)
That's odd. I could have sworn that I was just using an X-Terminal on it a few minutes ago.
Oh wait. I was using an X-Terminal. How in the world did that happen? </mock-sarcasm>
To be fair, getting X-Windows running is a recent development. On the other hand, the entire Minix3 codebase is a recent development. (Only a half-year old.) They're moving at a pretty good clip for a brand-new OS.
Re:All I want to know... (Score:3, Informative)
Minix will need some more features though, my guess is paging and threading are the major sticking points. Probably more system calls too but VM and threading are more work.
Being able to 'leverage' the enormous existing amount of software once Minix matures a bit would let Minix 'leapfrog' its 'competition'.
Disclaimer: I am involved with the Minix project.
Re:Grandma's computer never crashes (Score:1, Informative)
Re:So when did we forget... (Score:5, Informative)
Re:Page based sockets? (Score:3, Informative)
The kernel is the arbiter of shared memory, sure, because that's how it works, by futzing with the VM mappings of processes using it. It's not available to every process in the system though, it still has to ask the kernel for access.
But "communication" over shared memory is exactly how it works -- the size of the channel is the size of the entire shm segment. You write as much data as you want to the shm segment, then notify the receivers by using some sort of synchronization primitive -- 99% of the time when using shm, you use a semaphore, another SysV IPC primitive.
The SysV IPC bag of tricks also contains message queues, but I rarely see those used -- their API is weird and asymmetrical, and they're probably never implemented all that well due to their relative disuse. A good implementation could easily do them as zero-copy. BeOS was big on message-passing, and I imagine it used shared buffers to prevent unnecessary copies.
Incidentally, sockets and pipes are only conceptually stream-oriented. In reality, you're dealing with the size of the entire buffer, and it's really quite reasonably fast. When you want durability of your IPC resource, you can use named pipes, though you still have to handle recovery of data in the buffer if the producer crashes.
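For anyone who hasn't seen the write-then-notify pattern, here's a rough sketch of it. It's Python rather than C, and it swaps in an anonymous shared mmap plus a multiprocessing semaphore for the SysV shmget/semget calls, so treat it as an illustration of the idea and not of the SysV API. The function name and SEGMENT_SIZE are made up for the example, and it assumes a Unix system with fork().

```python
import mmap
import multiprocessing
import os
import struct

SEGMENT_SIZE = 4096  # hypothetical size; the whole segment is the "channel"


def send_via_shared_memory(payload: bytes) -> bytes:
    """Fork a producer that writes payload into shared memory, then read it back."""
    if len(payload) + 4 > SEGMENT_SIZE:
        raise ValueError("payload does not fit in the segment")

    # Anonymous shared mapping: the child inherits it across fork().
    segment = mmap.mmap(-1, SEGMENT_SIZE)
    # Semaphore starts at 0, so the consumer blocks until the producer posts.
    ready = multiprocessing.get_context("fork").Semaphore(0)

    pid = os.fork()
    if pid == 0:
        try:
            # Producer: write a length prefix plus the data, then notify.
            segment[0:4] = struct.pack("<I", len(payload))
            segment[4:4 + len(payload)] = payload
            ready.release()
        finally:
            os._exit(0)

    # Consumer: wait for the notification, then read straight from the segment.
    if not ready.acquire(timeout=5):
        raise TimeoutError("producer never signalled")
    (length,) = struct.unpack("<I", segment[0:4])
    data = bytes(segment[4:4 + length])
    os.waitpid(pid, 0)
    segment.close()
    return data
```

Note that nothing is copied through the kernel on the data path here; the semaphore only carries the "it's ready" signal, which is the point being made above.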
Re:Minix is already on version 3 (Score:3, Informative)
I'm thinking that's a ways down the road. If Minix could at least be viable for embedding into smaller, pre-configured devices, it could garner a lot more support in the device-driver arena.
And it won't even get as far as BSD unless it has a BSD-like license.
Sorry? Minix3 is distributed under the BSD license [minix3.org].
Any word on a Xen compatibility?
Apparently it's up and running [google.com].
Alternative: write the OS in Java (Score:2, Informative)
Once you have decided to have each module keep its grubby little paws off other modules' data structures, the next logical step is to put each one in a different address space to have the MMU hardware enforce this rule.
You only need to do this if you're writing both kernel and application code in a language like C that allows arbitrary access to the entire address space. But imagine if everything was written in something like Java that doesn't have pointers. You might not even need a "kernel" as such, everything could run in supervisor mode - the protection would be provided by the language, not by MMU hardware.
In case you think this is all pie in the sky, check out JNode [jnode.org] which is an OS written in Java.
The truth about microkernels (Score:5, Informative)
So that's what you need to know about microkernels.
Re:Whatever... (Score:2, Informative)
"A special process, called the reincarnation server, periodically pings each device driver. If the driver dies or fails to respond correctly to pings, the reincarnation server automatically replaces it by a fresh copy. The detection and replacement of nonfunctioning drivers is automatic, without any user action required. This feature does not work for disk drivers at present, but in the next release the system will be able to recover even disk drivers, which will be shadowed in RAM. Driver recovery does not affect running processes."
Sure, if the actual disk is broken this will not help, but for all the cases where the driver rather than the hardware is at fault this would be a godsend.
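To make the quoted idea concrete, here's a toy supervisor in Python. It's only a sketch under some loud assumptions: it treats "the process has exited" as the sole failure signal (the quote says the real reincarnation server also pings drivers), and the class and method names are invented for this example, not taken from Minix.

```python
import subprocess


class ReincarnationServer:
    """Toy supervisor: replaces a child process with a fresh copy if it has died."""

    def __init__(self, command):
        self.command = command  # argv list for the supervised "driver"
        self.restarts = 0
        self.process = subprocess.Popen(command)

    def ping(self) -> bool:
        # poll() returns None while the child is still running.
        return self.process.poll() is None

    def check_once(self) -> None:
        # One supervision round: restart the child only if it is dead.
        if not self.ping():
            self.process = subprocess.Popen(self.command)
            self.restarts += 1

    def stop(self) -> None:
        # Shut the supervised process down and reap it.
        self.process.kill()
        self.process.wait()
```

A real supervisor would call check_once() on a timer. The catch noted above still applies: restarting the driver does nothing for you if the hardware itself is what failed.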
Re:Still Debating (Score:4, Informative)
No-no-no-no-NO! I swear this kills me... Why does this myth continue to propagate? The ONLY thing about NT that was EVER uKernelish was that it did a lot of IPC (message passing) and that it implemented "personalities" (but it did so in a most decidedly non-microkernel way). Both of these traits were commonly associated with microKernels at the time, but regardless, the things that ACTUALLY make a kernel a microKernel never existed in NT... EVER...
Re:Still Debating (Score:3, Informative)
4. Network stacks, at least up to the transport layer, are implemented in kernel space.
Mirror set up (Score:3, Informative)
http://mirrors.easynews.com/minix3 [easynews.com]
Re:Page based sockets? (Score:3, Informative)
This is my opinion, but I had to say it: I personally don't like SysV. There are various ways to synchronize, and each method has advantages and disadvantages, but SysV is at the bottom of the pack if you ask me.
Process-shared pthread mutexes and conditions are much faster than SysV, because they usually don't make a system call. A disadvantage that SysV IPC and process-shared pthread mutexes have in common is demonstrated by the existence of a program called 'ipcrm', which is there "to delete resources that were left by irresponsible processes or process crashes": when a process crashes (or gets a kill -9, or a similar rude interruption), it may keep a lock, causing the other contenders to wait forever until somebody cleans up the mess.
That is why fcntl()-based locks can be the right thing for your app, even though, like SysV, they cost a syscall every time. fcntl()-based locks are cleared as soon as you exit or close the filedescriptor (you can use the same fd as the one you used to mmap() the shared memory). That last behaviour, in turn, is a disadvantage of fcntl locks, because you lose the lock too if you close another filedescriptor referring to the same file (grunt)... There is also flock(), which can hold only one lock per file (and, I'm not sure, but it may have the same limitation as the fcntl() locks).
None of these are 'perfect', but SysV has never been my choice yet... I have good hopes for the new 'robust futexes' in the very latest 2.6 kernels, which should combine the flexibility and efficiency of pthread mutexes with the robustness of flock()...
Oh, and yes: you can place a pthread mutex or condition in shared memory between processes (call pthread_mutexattr_setpshared() or pthread_condattr_setpshared() with PTHREAD_PROCESS_SHARED; it's not very well documented, but it has been in the NPTL for a while now and it works great).
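If the fcntl() gotcha sounds too silly to be true, here's a small Python sketch that shows it. It assumes a POSIX system with fork(); fcntl.lockf() is Python's wrapper around fcntl()-style record locks, and the two function names are made up for the example. A forked child does the probing, because a process can't conflict with its own fcntl locks.

```python
import fcntl
import os
import tempfile


def lock_is_free(path: str) -> bool:
    """Ask a forked child whether it can take an exclusive lock on path."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0  # lock acquired, so nobody else was holding it
        except OSError:
            status = 1  # lock is held by another process
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo_close_drops_lock():
    """Return (held_before, held_after) around closing a second fd to the file."""
    with tempfile.NamedTemporaryFile() as tmp:
        lock_fd = os.open(tmp.name, os.O_RDWR)
        fcntl.lockf(lock_fd, fcntl.LOCK_EX)
        held_before = not lock_is_free(tmp.name)

        # Open and close a different descriptor for the same file.
        other_fd = os.open(tmp.name, os.O_RDONLY)
        os.close(other_fd)

        held_after = not lock_is_free(tmp.name)
        os.close(lock_fd)
        return held_before, held_after
```

On a system with traditional POSIX record-lock semantics, I'd expect held_before to be true and held_after to be false, even though lock_fd itself was never closed in between.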
Re:To Interject for a moment (Score:3, Informative)
Pardon me? Sir? Sir? You seem to have diarrhea of the mouth and constipation of the brain.
Minix 1 & 2 codebases are indeed older than Linux. And they could have *been* Linux, except that Tanenbaum was focused on teaching. As a result, he rejected the requests to add features, thus leading to the development of Linux.
However, he has apparently decided that it's time to start a microkernel project to prove the concept in the modern world. As such, he's written a brand new codebase that's focused on the Microkernel architecture rather than producing something that is easily studied by students. As such, the Minix3 codebase is only 8 months old.
If you'd taken the 5 minutes necessary to RTFA, you'd know that. In the future, please take the time to read before expounding on your obvious disgust with a concept you are not paying any attention to.
Thank you, and please have a nice day.