
Tanenbaum-Torvalds Microkernel Debate Continues 534

Posted by ScuttleMonkey
from the arguements-that-never-die dept.
twasserman writes "Andy Tanenbaum's recent article in the May 2006 issue of IEEE Computer restarted the longstanding Slashdot discussion about microkernels. He has posted a message on his website that responds to the various comments, describes numerous microkernel operating systems, including Minix3, and addresses his goal of building highly reliable, self-healing operating systems."
This discussion has been archived. No new comments can be posted.

  • Since I know that this story is going to turn into flame-fest central, I'm going to try to head things off by interjecting an intelligent conversation about some issues that are on my mind at the moment.

    First and foremost, does anyone have a torrent of Minix3? Tanenbaum is a bit worried [google.com] about getting slashdotted. If you've got one seeded, please share.

    Now with that out of the way. I don't know if anyone else has tried it yet, but Minix3 is kind of neat. It's a complete OS that implements the Microkernel concepts that he's been expounding on for years now. The upsides are that it supports POSIX standards (mostly), can run X-Windows, and is a useful development platform. Everything is very open, and still simple enough to trudge through without getting confused by the myriads of "gotchas" most OS code-bases contain. Unfortunately, it's still a long way from a usable OS.

    The biggest issue is that the system is lacking proper memory management. It currently uses static data segments which have to be predefined before the program is run. If the program goes over its data segment, it will start failing on mallocs. The result is that you often have to massively increase the data segment just to handle the peak usage. Right now I have BASH running with a segment size of about 80 megs just so I can run configure scripts. That means that every instance of BASH is taking up that much memory! There's apparently a Virtual Memory system in progress to help solve this issue, so this is (thankfully) a temporary problem.

    The other big issue is a lack of threading support. I'm trying to compile GNU PThreads [gnu.org] to cover over this deficiency, but it's been a slow process. (It keeps failing on the mctx stack configuration. I wish I understood what that was so I wouldn't have to blindly try different settings.)

    On the other hand, the usermode servers do work as advertised. For example, the network stack occasionally crashes under VMware. (I'm guessing it's the same memory problems I mentioned earlier.) Simply killing and restarting dhcpd actually does get the system back up and running. It's kind of neat, even though it does take some getting used to.

    All in all, I think it's a really cool project that could go places. The key thing is that it needs attention from programmers with both the desire and time to help. Tossing lame criticisms won't help the project reach that goal. So if you're looking to help out a cool operating system that's focused on stability, security, and ease of development, come check out Minix for a bit. The worst that could happen is that you'll decide that it isn't worth investing the time and energy. And who knows? With some work, Minix might turn out to be a good alternative to QNX. :-)
  • Re:Andy Tanenbaum ? (Score:3, Informative)

    by robla (4860) * on Monday May 15, 2006 @01:36PM (#15335870) Homepage Journal
    Can somebody enlighten me about Andy Tanenbaum?

    Read Tanenbaum's Wikipedia bio [wikipedia.org].
  • by rdunnell (313839) * on Monday May 15, 2006 @01:37PM (#15335881)
    He developed Minix along with tons of other research work in distributed systems, networks, and other computer science topics.

    If you have a computer science degree you have probably used at least one if not more of his textbooks. He's one of the more prominent computer science researchers of the last couple decades.

  • by bhirsch (785803) on Monday May 15, 2006 @01:45PM (#15335949) Homepage
    What's bugging me is that it is a mini-review of the OS and has nothing to do with monolithic vs. micro kernel debate.
  • Re:Still Debating (Score:2, Informative)

    by Respawner (607254) on Monday May 15, 2006 @01:46PM (#15335961)
    yeah, I wish we used microkernels today, maybe we could put it in OS X or something else nobody uses, oh wait ...
  • Minix 3 screenshots (Score:5, Informative)

    by mustafap (452510) on Monday May 15, 2006 @01:53PM (#15336002)

    I almost died of boredom looking for them. Here's the link, for the lazy:

    http://www.minix3.org/doc/screenies.html [minix3.org]
  • Re:Andy Tanenbaum ? (Score:3, Informative)

    by TheRaven64 (641858) on Monday May 15, 2006 @01:54PM (#15336015) Journal
    Did he ever create a running kernel ?

    No, actually he created two that I know of. Well, technically three since MINIX 3 is probably sufficiently different from MINIX 1 to be thought of as a different kernel. Amoeba was another microkernel-based OS designed to run on distributed systems, presenting an entire cluster as a single machine.

    MINIX 1 was a teaching tool. MINIX 3 is a real OS, although still very young (less than two years old), but doing very well. Amoeba is so far ahead of Linux conceptually that they don't even belong in the same category.

  • by everphilski (877346) on Monday May 15, 2006 @02:30PM (#15336355) Journal
    ... they are an exception to the "normal" car he was referring to.

    And even if you lumped them in with cars, you have, what, a few hundred Priuses that have reset buttons, among the hundreds of millions of cars. And every computer in existence still has a reset button, and at some point in time that reset button has been exercised.
  • by goombah99 (560566) on Monday May 15, 2006 @02:32PM (#15336363)
    Since people seem to think I was talking about shared memory in my post, let me clarify what I mean by a page socket. What I mean is that for two processes to communicate over a normal serial socket, they do it one byte at a time. With a page socket, one process could in effect send an entire page of memory to another. It would in fact simply be sending a pointer to some page of memory. This pointer could be to a read-only or read-write part of memory, enforced in the MMU. Thus sending a page of a data structure would be instantaneous.

    But it's not free-form shared memory. One process still has to request and/or accept a socket from another. They are isolated. The other process only gets the page pointer when the first one sends it.

    Thus not every process can write to every bit of shared memory, because they all live in separate process spaces, not just a kernel space.

    If a process dies, a re-incarnation server can restart it and it can once again send a message requesting the page of memory. It's still message based to request or send pages. Thus it maintains the robust, isolated, message based nature of a micro kernel in many ways.

    Or is this how shared kernel memory works now in Linux?
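
For comparison, the closest thing stock Unix offers to the "page socket" described above is probably passing a file descriptor over a Unix-domain socket and letting the receiver map it. The sketch below is only a userland analogue under that assumption, not the poster's proposal and not how Mach does it. The names `page_handoff`, `send_fd`, `recv_fd`, and the `/tmp/pagesock-XXXXXX` template are made up for illustration. Only one byte plus the descriptor crosses the socket; the payload stays in the shared page.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

#define PAGE_BYTES 4096

/* Ancillary-data buffer large enough for one file descriptor. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over Unix socket sock using SCM_RIGHTS. */
static int send_fd(int sock, int fd)
{
    char byte = 'p';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof fd);
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns it, or -1 on failure. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;
    if (recvmsg(sock, &msg, 0) != 1) return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c && c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_RIGHTS)
        memcpy(&fd, CMSG_DATA(c), sizeof fd);
    return fd;
}

/* The child fills one page and hands it over; the parent maps it read-only
 * and copies the text into out. Returns 0 on success, -1 on failure. */
int page_handoff(const char *text, char *out, size_t outlen)
{
    int sv[2];
    assert(outlen > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                              /* sender */
        char path[] = "/tmp/pagesock-XXXXXX";    /* hypothetical name */
        int fd = mkstemp(path);
        if (fd < 0) _exit(1);
        unlink(path);                            /* page lives on via the fd */
        if (ftruncate(fd, PAGE_BYTES) < 0) _exit(1);
        char *page = mmap(NULL, PAGE_BYTES, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        if (page == MAP_FAILED) _exit(1);
        strncpy(page, text, PAGE_BYTES - 1);     /* rest is already zeroed */
        _exit(send_fd(sv[1], fd) == 0 ? 0 : 1);
    }

    close(sv[1]);                                /* receiver */
    int rc = -1;
    int fd = recv_fd(sv[0]);
    if (fd >= 0) {
        char *page = mmap(NULL, PAGE_BYTES, PROT_READ, MAP_SHARED, fd, 0);
        if (page != MAP_FAILED) {
            strncpy(out, page, outlen - 1);
            out[outlen - 1] = '\0';
            munmap(page, PAGE_BYTES);
            rc = 0;
        }
        close(fd);
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return rc;
}
```

As in the post, the receiver sees the page only once the sender chooses to send it. Unlike the post's idea, the read-only mapping here is a choice made by the receiver rather than something the sender enforces.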

  • Try QNX. (Score:2, Informative)

    by tetabiate (55848) on Monday May 15, 2006 @02:35PM (#15336397)
    I played a little with it and it seems pretty fast and stable. A good set of GNU utils has already been ported, and even commercial software like Intel's C++ compiler and the Opera browser. However, it is not ready to replace Linux as a desktop/server OS, since it lacks a lot of applications/extensions like a good NFS client/server, journaled filesystems, etc. It is fast and reliable and has the potential to become a good desktop OS if someday the company decides to give it a chance outside the embedded and RTOS market.
  • by David Off (101038) on Monday May 15, 2006 @02:43PM (#15336470) Homepage
    > Virtually all of these postings have come from people who don't have a clue what a microkernel is or what one can do.

    Okay, I spent 2 years working as an engineer in the OSF's Research Institute developing Mach 3.0 from 1991. Let me answer Linus's question in a simple fashion. What Mach 3.0 bought you over Mach 2.5 or Mach 2.0 was a 12% performance hit, as every call to the OS had to make a User Space -> Kernel -> User Space hit. This was true on x86, Moto and any other processor architecture available to us at the time. Not one of our customers found this an acceptable price to pay and I very much doubt they would today. One of the reasons Microsoft moved a lot of functionality into the Kernel between NT 3.5 and NT 4.0 was performance (NT being, at its origins, a uK based OS).

    What of the advantages ?

    Is porting easier? No not really, the machine dependent code in Mach 2.5 and Mach 3.0 was already well abstracted.

    You could run two OS personalities at once, for example you could have an Apple OS and Unix running at the same time. But why would any real world clients want to do this?

    Problems in the OS personality wouldn't bring down the uKernel - but they might stop you doing any useful work while you reboot the OS personality.

    Other things like distributed operating systems (and associated fault tolerance) were perhaps aided by the uK design and this is a path that, in my humble opinion, the OSF should have pursued with greater zeal than they did. Back in 1991 we had a Mach 3.0 based system that ran a uK across an array of x86 nodes but had different parts of the OS - say IO or memory management running on different nodes. From a user standpoint all the machines (in reality bog standard 386 machines linked by FDDI) looked like a single computer running a Unix like OS.

    I remember discussing Linux with my colleagues back in 1993. Some were impressed and thought the nascent OS model was very powerful; others dismissed it as a toy with no real future. I suspect Tanenbaum was also amongst the pooh-poohers and has become pretty annoyed about how things have turned out.
  • Page based messages (Score:3, Informative)

    by CustomDesigned (250089) on Monday May 15, 2006 @02:52PM (#15336544) Homepage Journal
    That is how messages have worked in Mach since its inception. The Microkernel would always send messages by page table manipulation whenever possible. Minix-1 did not work that way (for simplicity) - it just copied the bytes. Someone who has downloaded Minix-3 will have to tell us whether Minix-3 can send "page based" messages.
  • And v3.12 (I think, I'm going from memory here) will finally support the X windowing system

    That's odd. I could have sworn that I was just using an X-Terminal on it a few minutes ago.

    Oh wait. I was using an X-Terminal. How in the world did that happen? </mock-sarcasm>

    To be fair, getting X-Windows running is a recent development. On the other hand, the entire Minix3 codebase is a recent development. (Only a half-year old.) They're moving at a pretty good clip for a brand-new OS. :-)
  • by ezzzD55J (697465) <slashdot5@scum.org> on Monday May 15, 2006 @03:03PM (#15336641) Homepage
    Yes it is, and I think it is a very good idea.

    Minix will need some more features though, my guess is paging and threading are the major sticking points. Probably more system calls too but VM and threading are more work.

    Being able to 'leverage' the enormous existing amount of software once Minix matures a bit would let Minix 'leapfrog' its 'competition'.

    Disclaimer: I am involved with the Minix project.

  • by tygt (792974) on Monday May 15, 2006 @03:06PM (#15336661)
    And thus, she's not a user.
  • by qbwiz (87077) * <john@baumanfamily. c o m> on Monday May 15, 2006 @03:09PM (#15336679) Homepage
    Sure it has a monolithic kernel. It's just that it also has a microkernel, too.
  • by nuzak (959558) on Monday May 15, 2006 @03:10PM (#15336690) Journal
    > Perhaps I'm mistaken, but isn't shared memory essentially available to the entire macro-kernel and all its processes?

    The kernel is the arbiter of shared memory, sure, because that's how it works, by futzing with the VM mappings of processes using it. It's not available to every process in the system though, it still has to ask the kernel for access.

    But "communication" over shared memory is exactly how it works -- the size of the channel is the size of the entire shm segment. You write as much data as you want to the shm segment, then notify the receivers by using some sort of synchronization primitive -- 99% of the time when using shm, you use a semaphore, another SysV IPC primitive.

    The SysV IPC bag of tricks also contains message queues, but I rarely see those used -- their API is weird and asymmetrical, and they're probably never implemented all that well due to their relative disuse. A good implementation could easily do them as zero-copy. BeOS was big on message-passing, and I imagine it used shared buffers to prevent unnecessary copies.

    Incidentally, sockets and pipes are only conceptually stream-oriented. In reality, you're dealing with the size of the entire buffer, and it's really quite reasonably fast. When you want durability of your IPC resource, you can use named pipes, though you still have to handle recovery of data in the buffer if the producer crashes.
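
The write-then-notify pattern described in this comment can be sketched with the SysV calls it names. This is a minimal illustration, assuming a single writer and a single reader. The function name `shm_send_receive`, the helper `sem_add`, the `sem_arg` union, and the 4096-byte segment size are all invented for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_SIZE 4096

/* Linux expects the caller to declare this union for semctl(). */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Add delta to semaphore 0; a negative delta blocks until it can be applied. */
static int sem_add(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* The child writes msg into the segment and posts the semaphore; the parent
 * waits on the semaphore, then reads. Returns 0 on success, -1 on failure. */
int shm_send_receive(const char *msg, char *out, size_t outlen)
{
    int rc = -1;
    int shmid, semid;
    pid_t pid;
    char *seg = (char *)-1;
    union sem_arg zero = { .val = 0 };

    assert(outlen > 0);
    shmid = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (shmid < 0 || semid < 0) goto cleanup;
    if (semctl(semid, 0, SETVAL, zero) < 0) goto cleanup;
    seg = shmat(shmid, NULL, 0);
    if (seg == (char *)-1) goto cleanup;

    pid = fork();
    if (pid < 0) goto cleanup;
    if (pid == 0) {                      /* writer */
        strncpy(seg, msg, SEG_SIZE - 1);
        seg[SEG_SIZE - 1] = '\0';
        sem_add(semid, 1);               /* notify the reader */
        _exit(0);
    }
    if (sem_add(semid, -1) == 0) {       /* reader: wait for the notification */
        strncpy(out, seg, outlen - 1);
        out[outlen - 1] = '\0';
        rc = 0;
    }
    waitpid(pid, NULL, 0);

cleanup:
    if (seg != (char *)-1) shmdt(seg);
    if (shmid >= 0) shmctl(shmid, IPC_RMID, NULL);
    if (semid >= 0) semctl(semid, 0, IPC_RMID);
    return rc;
}
```

Note the explicit `IPC_RMID` calls: SysV segments and semaphores outlive the processes that created them unless someone removes them.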
  • Unless Minix finds an easy way to port Linux drivers, it won't go further on the desktop than BSD.

    I'm thinking that's a ways down the road. If Minix could at least be viable for embedding into smaller, pre-configured devices, it could garner a lot more support in the device-driver arena.

    And it won't even get as far as BSD unless it has a BSD-like license.

    Sorry? Minix3 is distributed under the BSD license [minix3.org].

    Any word on a Xen compatibility?

    Apparently it's up and running [google.com]. :-)
  • by Hugo Graffiti (95829) on Monday May 15, 2006 @04:05PM (#15337177)
    Seriously. Maybe not Java itself, but a kind of system level version of Java. Andy Tanenbaum says:

    Once you have decided to have each module keep its grubby little paws off other modules' data structures, the next logical step is to put each one in a different address space to have the MMU hardware enforce this rule.

    You only need to do this if you're writing both kernel and application code in a language like C that allows arbitrary access to the entire address space. But imagine if everything was written in something like Java that doesn't have pointers. You might not even need a "kernel" as such, everything could run in supervisor mode - the protection would be provided by the language, not by MMU hardware.

    In case you think this is all pie in the sky, check out JNode [jnode.org] which is an OS written in Java.

  • by Animats (122034) on Monday May 15, 2006 @04:06PM (#15337189) Homepage
    The real truth about microkernels is about like this:

    • Getting the architecture of a microkernel right is really hard. There are some very subtle issues in how interprocess communication interacts with scheduling. If these are botched, performance under load will be terrible. QNX got the performance part right. Mach got it wrong. Early Minix didn't address this issue. See this article in Wikipedia [wikipedia.org]. Other big issues include the mechanism by which one process finds another, and how mutually mistrustful processes interact. If you botch the basic design decisions, your microkernel will suck. Guaranteed.
    • Most academic attempts at microkernels have been duds. One can argue over why, but it's the commercial ones, like QNX, VM, and KeyKOS that work well, while the academic ones, like Mach, EROS, and the Hurd have been disappointing.
    • Security models really matter. And they're hard. Multics got this right. KeyKOS got this right. QNX is no better than UNIX in this area. Designers must work through "A can't do X, but A can trick B into doing X" issues.
    • Trying to turn a monolithic kernel into a microkernel doesn't work well. Mach, which started life as BSD UNIX, ran into this problem, which is why MacOS X isn't based on the microkernel version of Mach.
    • Drivers in user space have real advantages. Not only is the protection and restartability better, but because they have access to all the regular user program facilities, drivers for more modern devices are much easier. Things like Firewire and USB device discovery and hot-plugging reconfiguration are far easier at the user level, where you have threads, can block, and can call other programs. The old "top half and bottom half" driver approach doesn't generalize well to today's more dynamic configurations. Monolithic kernels have had to add kernel threads and dynamic loading of modules to handle all this, resulting in kernel bloat. Of course, a big advantage of less-privileged drivers is blame management - you can tell whether the OS or the driver is at fault.
    • Startup requires more attention. A microkernel often doesn't contain the drivers needed to get itself started. So the startup and booting process is more complex. QNX has a boot loader which loads the kernel and any desired set of programs as part of the boot image. This gets the console driver and disk driver in at startup, without having to make them part of the kernel.
    • The performance penalty is real, but not that big. There's a performance penalty associated with the extra interprocess communication. It's usually not that big, but there are areas where it's a problem. If it takes one interprocess call for each graphics operation, for example, performance will be terrible. NT 3.51 had a nice solution to this problem, designed by Dave Cutler. (NT 4 and later have a more monolithic kernel, but that had to do more with making NT bug-compatible with Windows 95 than with performance problems.)
    • I/O channels would help. IBM mainframe channels, which have an MMU between the peripheral and main memory, are better suited to a microkernel architecture than the "bus" model the microcomputer world uses. In the mainframe world, the kernel can put a program in direct communication with the hardware without giving it the ability to write all over memory. So there's little penalty for having drivers in user space. Which is why VM for IBM mainframes has been doing this for decades.
    • If you get it right, the kernel doesn't change much over time. This is the big win, and why microkernels become more stable over time. In the QNX world, USB and Firewire support were added with no changes to the kernel. (I wrote a FireWire camera driver for QNX, so I'm sure of this.) The IBM VM kernel has changed little in decades.

    So that's what you need to know about microkernels.

  • Re:Whatever... (Score:2, Informative)

    by SmokedS (973779) on Monday May 15, 2006 @04:16PM (#15337324)
    I would most definitely care. From the minix site:

    "A special process, called the reincarnation server, periodically pings each device driver. If the driver dies or fails to respond correctly to pings, the reincarnation server automatically replaces it by a fresh copy. The detection and replacement of nonfunctioning drivers is automatic, without any user action required. This feature does not work for disk drivers at present, but in the next release the system will be able to recover even disk drivers, which will be shadowed in RAM. Driver recovery does not affect running processes."

    Sure, if the actual disk is broken this will not help, but for all the cases where the driver rather than the hardware is at fault this would be a godsend.
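
The restart half of that idea is easy to sketch in ordinary Unix terms. The following is not Minix's reincarnation server, only a rough analogue, assuming the "driver" is a plain child process and that failure means dying rather than failing to answer a ping. `supervise`, `worker_fn`, and `flaky_worker` are hypothetical names for the example.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*worker_fn)(int attempt);

/* Run worker in a child process and start a fresh copy whenever it dies or
 * exits with a non-zero status. Returns the number of restarts that were
 * needed, or -1 if it still fails after max_restarts restarts. */
int supervise(worker_fn worker, int max_restarts)
{
    assert(worker != NULL);
    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();
        if (pid < 0) return -1;
        if (pid == 0) _exit(worker(attempt));   /* child: be the "driver" */

        int status;
        if (waitpid(pid, &status, 0) < 0) return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;                     /* clean exit, stop here */
        /* killed by a signal or failed: loop and start a fresh copy */
    }
    return -1;
}

/* Demo worker that crashes on its first two incarnations. */
int flaky_worker(int attempt)
{
    if (attempt < 2) raise(SIGKILL);            /* simulate a hard crash */
    return 0;
}
```

With the demo worker, `supervise(flaky_worker, 5)` needs two restarts before the worker exits cleanly. A fuller version would also need the periodic ping the quoted text mentions, so that a hung driver is detected as well as a dead one.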

  • Re:Still Debating (Score:4, Informative)

    by galvanash (631838) on Monday May 15, 2006 @04:32PM (#15337487)
    NT is a hybrid. It has Microkernel facilities that are constantly being used for something different in each version. Early versions of NT were apparently full Microkernels, but this was changed for performance.

    No-no-no-no-NO! I swear this kills me... Why does this myth continue to propagate? The ONLY thing about NT that was EVER uKernelish was that it did a lot of IPC (message passing) and that it implemented "personalities" (but it did so in a most decidedly non-microkernel way). Both of these traits were commonly associated with microKernels at the time, but regardless the things that ACTUALLY make a kernel a microKernel never existed in NT... EVER...

    1. All drivers that touched an I/O port HAD to be implemented in kernelmode. That restriction goes back to the original NT 3.1 release and was NEVER otherwise.
    2. Although filesystems are modular to a certain degree, the nuts and bolts of all filesystems have to be implemented in kernelspace.
    3. While initially GDI device drivers (i.e. graphics and printing) were implemented in userspace, this concept was thrown out in NT4. Btw, there was nothing especially microkernelish about this; X is implemented in a similar way to the pre-NT4 GDI as far as that goes. Graphics and printing after all are not generally an essential function of an OS from a functionality perspective.
  • Re:Still Debating (Score:3, Informative)

    by Guy Harris (3803) <guy@alum.mit.edu> on Monday May 15, 2006 @05:27PM (#15338023)

    4. Network stacks, at least up to the transport layer, are implemented in kernel space.

  • Mirror set up (Score:3, Informative)

    by MrPerfekt (414248) on Monday May 15, 2006 @07:01PM (#15338722) Homepage Journal
    I placed the IDE files on our mirrors server here at Easynews...

    http://mirrors.easynews.com/minix3 [easynews.com]
  • by jelle (14827) on Monday May 15, 2006 @09:56PM (#15339586) Homepage
    "99% of the time when using shm, you use a semaphore, another SysV IPC primitive."

    This is my opinion, but I had to say it: I personally don't like SysV. There are various ways to synchronize, and each method has advantages and disadvantages, but SysV is at the bottom of the pack if you ask me.

    Process-shared pthread mutexes and conditions are much faster than SysV, because they usually don't make a system call. A disadvantage of SysV IPC that process-shared pthread mutexes share is demonstrated by the existence of a program called 'ipcrm', which is there "to delete resources that were left by irresponsible processes or process crashes": when a process crashes (or gets a kill -9, or a similar rude interruption), it may keep a lock, causing the other contenders to wait forever until somebody cleans up the mess.

    That is why fcntl()-based locks can be the right thing for your app, even though it's a syscall every time too like SysV. fcntl()-based locks are cleared as soon as you exit or close the filedescriptor (you can use the same fd as the one you used to mmap() the shared memory). That last behaviour, in turn, is a disadvantage of fcntl locks, because you lose the lock too if you close another filedescriptor referring to the same file (grunt)... There is also the flock(), which can hold only one lock per file (and I'm not sure, may have the same limitation as the fcntl() locks).

    None of these are 'perfect', but SysV has never been my choice yet... I have good hopes for the new 'robust futexes' in the very latest 2.6 kernels, which should combine the flexibility and efficiency of pthread mutexes with the robustness of flock()...

    Oh, and yes: you can place a pthread mutex or condition in shared memory between processes (pthread_mutexattr_setpshared(attr, PTHREAD_PROCESS_SHARED) or pthread_condattr_setpshared(attr, PTHREAD_PROCESS_SHARED); it's not very well documented, but it has been in NPTL for a while now and it works great).
  • Remember microkernel-loving theorists out there, we're talking about Minix, something quite a lot _older_ than Linux.

    Pardon me? Sir? Sir? You seem to have diarrhea of the mouth and constipation of the brain.

    Minix 1 & 2 codebases are indeed older than Linux. And they could have *been* Linux, except that Tanenbaum was focused on teaching. As a result, he rejected the requests to add features, thus leading to the development of Linux.

    However, he has apparently decided that it's time to start a microkernel project to prove the concept in the modern world. As such, he's written a brand new codebase that's focused on the Microkernel architecture rather than producing something that is easily studied by students. As such, the Minix3 codebase is only 8 months old.

    If you'd taken the 5 minutes necessary to RTFA, you'd know that. In the future, please take the time to read before expounding on your obvious disgust with a concept you are not paying any attention to.

    Thank you, and please have a nice day.
