Tanenbaum-Torvalds Microkernel Debate Continues

twasserman writes "Andy Tanenbaum's recent article in the May 2006 issue of IEEE Computer restarted the longstanding Slashdot discussion about microkernels. He has posted a message on his website that responds to the various comments, describes numerous microkernel operating systems, including Minix3, and addresses his goal of building highly reliable, self-healing operating systems."
  • Re:Andy Tanenbaum ? (Score:3, Interesting)

    by fistfullast33l ( 819270 ) on Monday May 15, 2006 @01:38PM (#15335892) Homepage Journal
    I know you're being facetious (c'mon, mod points for the SAT word), but for those who don't know, Andrew Tanenbaum is covered at Wikipedia [wikipedia.org]. His textbook, Modern Operating Systems, is probably one of the most widely used and respected resources on the subject. He also likes to get into flame wars with Linus Torvalds when he gets bored, which is ironic, because Linus supposedly used Tanenbaum's Minix as a starting point and influence for Linux.
  • Page based sockets? (Score:5, Interesting)

    by goombah99 ( 560566 ) on Monday May 15, 2006 @01:41PM (#15335905)
    It seems to me the whole issue boils down to memory isolation. If you always have to pass messages to communicate, you have good isolation but costly synchronization of data/state, and hence potential performance hits. And vice versa: Linux is prone to instability and security breaches from every non-isolated portion of it.

    As I understand it, as a novice, the only way to communicate or synchronize data is via copies of data passed via something analogous to a socket. A socket is a serial interface. If you think about this for a moment, you realize it could be thought of as one byte of shared memory. Thus a copy operation is in effect the iteration of this one byte over the data to share. At any one moment you can only synchronize that one byte.

    But this suggests its own solution. Why not share pages of memory in parallel between processes? This is short of full access to all of the state of another process, but it would allow locking and synchronization of entire shared states, and the rapid passing of data without copies.

    Then it would seem like the isolation of microkernels would be fully gained without the complications that arise in multiprocessing or compartmentalization.

    Or is there a bigger picture I'm missing?
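
    A minimal sketch of the shared-page idea, assuming the POSIX shm_open/mmap API (the object name "/demo_page", the one-page size, and the omission of error handling are all just for illustration):

        /* Parent and child map the same page; writes by one are visible to the other. */
        /* Compile with: cc demo.c -lrt (older glibc needs -lrt for shm_open). */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void) {
            const char *name = "/demo_page";           /* arbitrary name for illustration */
            int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
            ftruncate(fd, 4096);                       /* one page */
            char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

            if (fork() == 0) {                         /* child writes into the shared page */
                strcpy(page, "hello from the child");
                return 0;
            }
            wait(NULL);                                /* parent sees the data without any copy */
            printf("%s\n", page);
            shm_unlink(name);
            return 0;
        }

    Both processes see the same physical page, so nothing is copied or serialized; the cost is that they now need some way to coordinate their writes, which is exactly where the isolation argument starts.
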
  • Whatever... (Score:3, Interesting)

    by MoxFulder ( 159829 ) on Monday May 15, 2006 @01:49PM (#15335977) Homepage
    Linux is very reliable for me, even on newer hardware with a bleeding edge kernel. Why should I care whether it has a microkernel or monolithic kernel? Everything I deal with is user space. If it runs GNOME, is POSIX-like, and supports some kind of automatic package management, I'll be happy as a clam.

    Will hardware drivers be developed faster and more reliably with a microkernel? That seems to be the biggest hurdle in reliable OS development these days... Anyone have a good answer for that? I honestly don't know.
  • by HornWumpus ( 783565 ) on Monday May 15, 2006 @01:52PM (#15336000)
    Microkernels were too expensive, performance-wise, in the CPU-cycle-starved '80s. The extra context-switching overhead was unacceptable (e.g. the video performance of NT 3.51; IIRC, four context switches per pixel drawn).

    In the CPU-cycle-flush '00s the debate is just different. Less code running at ring 0 means less code that can cause a kernel panic, blue screen, or whatever they call it in OSX.

    A significant part of the market is OK running Java. The comparatively small performance cost and high stability payoff of a microkernel make the tradeoff a no-brainer.
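
    For a rough sense of what context-switch overhead costs on a given machine, one common back-of-the-envelope test is to bounce a byte between two processes over a pair of pipes and divide the elapsed time by the number of round trips (each round trip forces at least two switches). A sketch, with an arbitrary iteration count and no error handling:

        /* Estimate context-switch cost by ping-ponging a byte between two processes. */
        #include <stdio.h>
        #include <sys/time.h>
        #include <unistd.h>

        int main(void) {
            int ping[2], pong[2];
            char b = 0;
            const int iters = 100000;                  /* arbitrary iteration count */
            pipe(ping);
            pipe(pong);

            if (fork() == 0) {                         /* child: echo every byte back */
                for (int i = 0; i < iters; i++) {
                    read(ping[0], &b, 1);
                    write(pong[1], &b, 1);
                }
                return 0;
            }

            struct timeval t0, t1;
            gettimeofday(&t0, NULL);
            for (int i = 0; i < iters; i++) {          /* parent: send, then wait for the echo */
                write(ping[1], &b, 1);
                read(pong[0], &b, 1);
            }
            gettimeofday(&t1, NULL);

            double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
            printf("~%.2f us per round trip (>= 2 switches each)\n", us / iters);
            return 0;
        }
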

  • Re:Andy Tanenbaum ? (Score:3, Interesting)

    by Herkum01 ( 592704 ) on Monday May 15, 2006 @01:55PM (#15336022)

    Linus has written the Linux kernel used in millions of computers ranging from PCs to mainframes.

    Tanenbaum still has Minix and a doctorate.

    Education means nothing if you do nothing with it. Linus has applied his education very well and progressed well beyond anything Tanenbaum has accomplished, with or without a doctorate...

  • Re:Still Debating (Score:3, Interesting)

    by Arandir ( 19206 ) on Monday May 15, 2006 @01:58PM (#15336045) Homepage Journal
    That's a very good point, and one that people keep forgetting. If microkernels are so great, where are they? Let's take a look at notable microkernels:

    * QNX Neutrino. This is the most successful microkernel ever. It deserves all the praise it gets. Yet it is still a niche product.

    * Hurd. After twenty years we're still waiting for a halfway stable release. Hurd development is almost an argument *for* monolithic kernels!

    * Minix. This is still an educational kernel. A teaching tool. It remains unsuitable for "real world" use.

    * Mach. People claim OSX is a microkernel since it is built on top of Mach. But that ignores the real world fact that OSX is monolithic. People have been misled by the name.

    * NT. This is NOT a microkernel! You don't believe anything else Microsoft says, so why do you believe this fairy tale?

    In short, QNX is the only successful real-world microkernel. Linus happens to be right on this one: microkernels add too much complexity to the software. From ten thousand feet the high-level architecture looks simple and elegant, but the low-level implementation is fraught with difficulties and hidden pitfalls.
  • by rcs1000 ( 462363 ) * <rcs1000&gmail,com> on Monday May 15, 2006 @02:00PM (#15336060)
    Try doing what I do with Minix3: run it in VMWare, allocate it 4GB of RAM, and let VMWare do your virtual memory management.

    (Yes, I know it's an ugly hack. But it means I don't worry about giving Bash 120 MB, and cc some enormous number...)
  • by after fallout ( 732762 ) on Monday May 15, 2006 @02:01PM (#15336079)
    As I read this, it seems quite analogous to objects in C++ (or any other OOL). All kernel interfaces could publish the data they want to have public, and hide the data that is private to the implementation of the feature.

    I would suggest that this will eventually make its way into kernel systems (just like any other good idea that has come from the programming language fields).
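
    In C, the usual way to publish an interface while hiding the implementation is the opaque handle: the header forward-declares the type and exports functions, and the field layout stays private to one source file. A minimal sketch (the "ktimer" names are invented purely for illustration):

        /* ktimer.h -- public interface: callers see only an opaque handle. */
        struct ktimer;                              /* layout deliberately not published */
        struct ktimer *ktimer_create(unsigned interval_ms);
        unsigned       ktimer_remaining(const struct ktimer *t);
        void           ktimer_destroy(struct ktimer *t);

        /* ktimer.c -- private implementation: fields can change without breaking callers. */
        #include <stdlib.h>

        struct ktimer {
            unsigned interval_ms;                   /* private state, invisible to users */
            unsigned elapsed_ms;
        };

        struct ktimer *ktimer_create(unsigned interval_ms) {
            struct ktimer *t = calloc(1, sizeof *t);
            if (t) t->interval_ms = interval_ms;
            return t;
        }

        unsigned ktimer_remaining(const struct ktimer *t) {
            return t->interval_ms - t->elapsed_ms;
        }

        void ktimer_destroy(struct ktimer *t) {
            free(t);
        }

    Callers can never poke at interval_ms directly, so the implementation is free to change; a microkernel enforces the same information-hiding property between servers with hardware protection rather than the compiler.
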
  • by nuzak ( 959558 ) on Monday May 15, 2006 @02:02PM (#15336091) Journal
    > But this suggests its own solution. Why not share pages of memory in parallel between processes?

    This is precisely what shared memory is, and it's used all over the place, in Unix and Windows both. When using it, you are of course back to shared data structures and all of the synchronization nastiness, but a) sometimes it's worth paying the complexity price, and b) sometimes it doesn't actually matter if concurrent access corrupts the data if something else is going to correct it (think packet collisions).

    Still, if you have two processes that both legitimately need to read and write the same data, you probably need three processes. The communication overhead with the third process is usually pretty negligible.

    There are even more exotic concurrency mechanisms that don't require copying or even explicit synchronization, but they're usually functional in nature, and incompatible with the side-effectful state machines of most OSes and applications in existence today.
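
    The "synchronization nastiness" usually takes the form of a lock that lives inside the shared region itself; with POSIX threads that is a process-shared mutex. A minimal sketch using an anonymous shared mapping (error handling omitted; compile with -pthread):

        /* A counter in shared memory, protected by a process-shared mutex. */
        #include <pthread.h>
        #include <sys/mman.h>
        #include <sys/wait.h>
        #include <unistd.h>

        struct shared_counter {
            pthread_mutex_t lock;
            long value;
        };

        int main(void) {
            /* MAP_ANONYMOUS | MAP_SHARED memory stays shared with children after fork(). */
            struct shared_counter *sc = mmap(NULL, sizeof *sc,
                                             PROT_READ | PROT_WRITE,
                                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);

            pthread_mutexattr_t attr;
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
            pthread_mutex_init(&sc->lock, &attr);

            if (fork() == 0) {                        /* child and parent both update the counter */
                pthread_mutex_lock(&sc->lock);
                sc->value++;
                pthread_mutex_unlock(&sc->lock);
                return 0;
            }
            pthread_mutex_lock(&sc->lock);
            sc->value++;
            pthread_mutex_unlock(&sc->lock);
            wait(NULL);
            return 0;
        }

    Forget to take the lock in just one of the sharing processes and you get exactly the kind of cross-process corruption that pure message passing is designed to rule out.
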

  • by dpbsmith ( 263124 ) on Monday May 15, 2006 @02:15PM (#15336217) Homepage
    I have never experienced the "stalling" problem that affected a very small number of 2004 and 2005 Priuses last year. (OK, hubris correction, make that "not yet..." although my car's VIN is outside the range of VINs supposedly affected).

    It was apparently due to a firmware bug.

    In any case, when it happened, according to personal reports in Prius forums from owners to whom it happened, the result was loss of internal-combustion-engine power, meaning they had about a mile of electric-powered travel to get to a safe stopping location. At that point, if you reset the computer by cycling the "power" button three times, most of the warning lights would go off and the car would be fine again. Of course, many to whom this happened didn't know the three-push trick... and those who did usually elected to drive to the nearest Toyota dealer for a "TSB" ("technical service bulletin" = firmware patch).

    These days, conventional-technology cars have a lot of firmware in them, and I'll bet they have a "reset" function available, even if it's not on the dashboard and visible to the driver.
  • by goombah99 ( 560566 ) on Monday May 15, 2006 @02:16PM (#15336225)
    Perhaps I'm mistaken, but isn't shared memory essentially available to the entire macro-kernel and all its processes? Something more fine-grained, like a page-based socket, would let two processes agree to communicate. They would be sending messages to each other over a very wide channel: the entire page, not some serial socket.

    Some other process could not butt in on this channel, however, since it's not registered to that socket.

    Or is that how shared memory works?

    Tanenbaum's point is that he can have a reincarnation server that can recreate a stalled process. Thus, by using exclusively message-based communication, he can assure that this won't disrupt the state of the operating system (because it's all isolated). The problem is when two processes need to share lots of data quickly. Then message passing gets in your way.

    Something that had the negotiated nature of a socket, yet allowed two processes to synchronize their large data structures without passing the entire structure serially, would be ideal. Then you could still potentially have things like a reincarnation server. A new process would simply have to re-negotiate the socket.
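
    On Unix you can get fairly close to that "negotiated page channel" today: the two processes negotiate over an ordinary Unix-domain socket, but what travels across it is a file descriptor for a shared-memory object (sent with SCM_RIGHTS), which the receiver then mmaps. No third process can join unless it is explicitly handed the descriptor. A condensed sketch of the sending side, assuming the shm fd was created with shm_open as in the earlier example:

        /* Send an already-created shared-memory fd to a peer over a Unix-domain socket. */
        #include <string.h>
        #include <sys/socket.h>
        #include <sys/uio.h>

        int send_page_fd(int sock, int shm_fd) {
            char dummy = 'p';
            struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };

            union {                                   /* aligned buffer for the control message */
                struct cmsghdr hdr;
                char buf[CMSG_SPACE(sizeof(int))];
            } ctrl;
            memset(&ctrl, 0, sizeof ctrl);

            struct msghdr msg = { 0 };
            msg.msg_iov = &iov;
            msg.msg_iovlen = 1;
            msg.msg_control = ctrl.buf;
            msg.msg_controllen = sizeof ctrl.buf;

            struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
            cm->cmsg_level = SOL_SOCKET;
            cm->cmsg_type = SCM_RIGHTS;               /* "the payload is file descriptors" */
            cm->cmsg_len = CMSG_LEN(sizeof(int));
            memcpy(CMSG_DATA(cm), &shm_fd, sizeof(int));

            return (int)sendmsg(sock, &msg, 0);       /* receiver recvmsg()s and mmap()s the fd */
        }

    The receiver does the mirror-image recvmsg(), pulls the descriptor out of the control message, and mmaps it. If either side dies, a reincarnated replacement just redoes the negotiation, which fits the reincarnation-server model described above.
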

  • by StevenMaurer ( 115071 ) on Monday May 15, 2006 @02:20PM (#15336264) Homepage
    ...so I can't spend a lot of time discussing this, but I always thought that the main benefit of microkernels is completely wasted unless you actually have utilities that can work in partially functioning environments. What good is it to be able to continue to run a kernel even with your SCSI drive disabled, if all your software to fix the problem is on the SCSI drive?

    Now, in theory I could see a high-availability microkernel being a good, less expensive alternative to a classic mainframe environment, especially if you had a well-written auto-healing system built in as a default. But that would require a lot of work outside the kernel that just isn't being done right now. And until it is, microkernels don't have anything more to offer than monolithic kernels.

    To put it in API terms: it doesn't matter very much whether your library correctly returns an error code for every possible circumstance, when most user-level code doesn't bother to check it (or just exits immediately, even on addressable errors).
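
    The classic illustration is write(), which can fail outright or succeed only partially; the kernel's careful error reporting is only worth as much as the caller's willingness to look at it. A small sketch of the two styles:

        /* The kernel reports short writes and errors -- but only if the caller checks. */
        #include <errno.h>
        #include <stdio.h>
        #include <unistd.h>

        void careless(int fd, const char *buf, size_t len) {
            write(fd, buf, len);                      /* result ignored: failures vanish silently */
        }

        int careful(int fd, const char *buf, size_t len) {
            while (len > 0) {
                ssize_t n = write(fd, buf, len);      /* may write less than asked, or fail */
                if (n < 0) {
                    if (errno == EINTR) continue;     /* retry an interrupted call */
                    perror("write");
                    return -1;
                }
                buf += n;
                len -= (size_t)n;
            }
            return 0;
        }
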

  • by nonmaskable ( 452595 ) on Monday May 15, 2006 @02:27PM (#15336328)
    Tanenbaum as always makes a good conceptual case for his perspective, but as time has gone by his examples increasingly prove Linus' point.

    Except for QNX, the systems he cites are either vaporware (Coyotos, HURD) or esoteric research toys (L4Linux, Singularity), or they brutally violate the microkernel concept (MacOSX, Symbian).

    Even his best example, QNX, is a very niche product and hard to compare to something like Linux.

  • Re:Still Debating (Score:5, Interesting)

    Forgetting something? [wikipedia.org]

    * Minix. This is still an educational kernel. A teaching tool. It remains unsuitable for "real world" use.

    Actually, it's the start of a full-up microkernel operating system. This isn't your grand-pappy's Minix; it's a brand-new code base under the BSD license, intended to be developed out into a complete system. It's still taking baby steps at the moment, but it's coming along quite nicely.

    * NT. This is NOT a microkernel!

    NT is a hybrid. It has microkernel facilities that get used for something different in each version. Early versions of NT were apparently full microkernels, but this was changed for performance.

    * QNX Neutrino. This is the most successful microkernel ever. It deserves all the praise it gets. Yet it is still a niche product.

    I would hardly call QNX a "niche" product. Running on everything from your car engine to Kiosk PCs (yes, that stupid iOpener ran it too), it's an extremely powerful and versatile operating system. Its Microkernel architecture even gives it the ability to be heavily customized for the needs of the application. Don't need networking? So don't run the server! Need a GUI? Just add the Graphics server to the startup.

    Microkernels haven't failed. However, you may notice that nearly all the popular operating systems we use today were developed back in the late '80s and early '90s. The real problem is that there hasn't been a need to develop new OSes until now. Now that security and stability are more pressing issues than performance, we can go back to the drawing board and start designing new OSes to meet our needs for the next decade and a half.
  • A CPU like Kernel (Score:3, Interesting)

    by Twillerror ( 536681 ) on Monday May 15, 2006 @02:33PM (#15336370) Homepage Journal
    All of these ideas are old, and while high-performing they don't address the largest issue of all: cross-kernel compatibility.

    Sure, you can recompile and all that jazz, but I'd love to see a day where an app could run on any number of kernels out there. That creates real competition.

    What I'd like to see is a kernel more like a CPU. Instead of linking your kernel calls, you place them as if you were placing an assembly call. Then we can have many companies and open source organizations writing versions of it.

    As we move towards multi-core CPUs this could really lead to performance gains, where one or more of the many cores could be dedicated to kernel operations, listening for requests and taking care of them. No context switches needed, no privilege-mode switching.

    Drivers and everything else run outside of kernel mode and use low level microcode to execute the code.

    The best part, I think, is that you could make it backward compatible as we rewrite. A layer could handle old kernel calls and translate them to the microcodes.

    As we define everything more and more, we might even be able to design CPUs that can handle it better.

  • Re:Whatever... (Score:3, Interesting)

    by einhverfr ( 238914 ) <chris...travers@@@gmail...com> on Monday May 15, 2006 @02:44PM (#15336483) Homepage Journal
    Linux as a kernel is so reliable for me that the only times I have to use the reset button are when hardware malfunctions (usually something that a microkernel can't help with, like RAM, CPU, or the video card, though in the latter case I tend to just leave the computer running and ssh in from elsewhere...).

    I noticed two things about Tanenbaum's piece, though. Essentially all of the microkernels he listed were either used in dedicated (including embedded) systems or were not true microkernels by his own admission. Or, like HURD, they were not commonly used in production. I would add Unicos/mk to the dedicated-systems category because it is designed to run a single process efficiently across a large number of processors.

    So here we have it. Microkernels are *far* better in some environments where the computer has a single (and often well-defined) task, especially on well-defined hardware, but for things like mainframes or network servers they are not commonly used. I can only suggest that the tradeoffs are more subtle than are commonly discussed here.
  • by Anonymous Coward on Monday May 15, 2006 @02:49PM (#15336534)

    I think I'd prefer the Linux network stack, which AFAIK simply doesn't crash in the first place.

    Well... this is a more interesting comment than a first reading might suggest. I've always been a bit dubious of "tolerant" software. It might sound counter-intuitive, but I'd rather have libraries/kernels terminate the running program and output a big message saying why, rather than tolerate problems and try to continue. In the long term it pays off.

    A lot of problems in Windows come from Microsoft trying to build Windows to be extremely tolerant of crap software and bizarre library calls, and to keep running as long as possible... and that has come back to bite them. Lots of strange failures never get fixed because they don't terminate the program; they just end up generating shite later.

    I prefer my libraries and kernels to just say... wrong. Fuck off. Or at the very least spazz out and crash spectacularly -- because those sorts of problems GET FIXED QUICKLY. It sounds like a hacky and roundabout idea... but it does work. It forces software to be fixed instead of being tolerant of its faults. Perhaps if you were in an academic environment and in total control of the entire software stack, the pure platonic ideal of development would work.

    But you aren't, and it doesn't. Crashes and terminated programs get noticed and the problems fixed. It's real life coding in a nutshell.

    'Way back when we read the first rev of this discussion, Tanenbaum made good points. At the same time, Linus was able to make his little monolithic kernel project jump through the hoops he wanted it to.

    Years later, Tanenbaum still makes valid observations, Linus and others continue to make a rather larger project jump through the hoops, and that's fine. The results of academic research may or may not get traction outside of a university, but without the research, there wouldn't be alternatives to contemplate. If I've gathered nothing else about Linus' personality from his writings over the years, it's that he seems to be practical, not particularly hung up on architectural (or licensing) theories... unlike me.

    At some point, if his current architecture just isn't doing it for him any more, he might morph into Tanenbaum's 'A' student. It won't be because a microkernel was always right, but because it is right now.

  • The Big Picture (Score:2, Interesting)

    by thisisnotmyrealname ( 944948 ) on Monday May 15, 2006 @03:12PM (#15336709)
    I just finished watching the original Connections series with James Burke.

    The start and end of the series hit it home:
    The dependence on technology, and the use of technology to get us out of trouble created by technology.

    It's about how one technology is so interdependent on other technologies that one failure can cause a whole sector to fail (the initial reference is to electricity grid failures, usually caused by one small part failing).

    Wouldn't building a system so that one part cannot take down all the other parts be important?

    How would the Internet be if the entire thing would fail if one part stopped?

    Yes, I want a computer without a reset button...
  • by Hard_Code ( 49548 ) on Monday May 15, 2006 @03:21PM (#15336766)
    I think it's because typically the "message" which is meaningful to most applications is of larger granularity than a single byte, so it makes sense to synchronize around that message instead. Also, you want to ensure that the semantics of your API and your synchronization match. It makes no sense to preserve integrity without preserving semantics. The best way to do that is to either explicitly make a copy, or to "lease" the structure until such time as you are notified that all necessary work has been done. You need access to be atomic on the scale you are interested in, not just arbitrary bytes. After all, we already have instructions that are atomic on arbitrary bytes.
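
    That granularity point can be made concrete: the hardware (exposed in C11 via stdatomic.h) gives you atomicity on a single word, but a multi-field message needs either a lock around the whole thing or an explicit copy. A sketch of the contrast (the names are invented for illustration; compile with -pthread):

        /* Word-sized updates can be lock-free; multi-field messages need coarser atomicity. */
        #include <pthread.h>
        #include <stdatomic.h>
        #include <string.h>

        atomic_long packets_seen;                     /* one word: the hardware makes this atomic */

        struct message {                              /* several fields: no single instruction covers it */
            long seq;
            char body[64];
        };

        struct message current;
        pthread_mutex_t current_lock = PTHREAD_MUTEX_INITIALIZER;

        void count_packet(void) {
            atomic_fetch_add(&packets_seen, 1);       /* fine without a lock */
        }

        void publish(long seq, const char *body) {
            pthread_mutex_lock(&current_lock);        /* make the whole message atomic as a unit */
            current.seq = seq;
            strncpy(current.body, body, sizeof current.body - 1);
            current.body[sizeof current.body - 1] = '\0';
            pthread_mutex_unlock(&current_lock);
        }
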
  • by the_duke_of_hazzard ( 603473 ) on Monday May 15, 2006 @03:29PM (#15336848)
    The Linus vs. Tanenbaum debate reminds me of the debates I hear at work between those who understand that software is a commercial kludge between conflicting/changing requirements, the limitations of time, the abilities of support engineers, etc., and those who want to do everything "right", often blowing development budgets and producing unusable, over-optimised, hard-to-maintain code. I exaggerate the distinction for effect, of course. Linus has built a system; it works and it's used everywhere. Microkernels are all niche, and the benefits are debatable compared to the ubiquity of and support for monolithic kernels. That said, I think Tanenbaum is fundamentally right. But then I've had to train myself to be more like Linus than Tanenbaum to actually get things done.
  • by Russ Nelson ( 33911 ) <slashdot@russnelson.com> on Monday May 15, 2006 @04:15PM (#15337298) Homepage
    The other big issue is a lack of threading support.

    Threading is the spawn of the Solaris. Oops, I mean the devil. Forking was so slow on Solaris that they had to invent threading to have multiple contexts run at any speed. The whole point behind a microkernel is to HIDE information. Threading EXPOSES information between separately running processes, so you need to have mutexes, semaphores, and all of that synchronization crap that makes for buggy code.

    Threading is bad. Don't use it. When you have to use code that uses it, refactor the code to use processes or a state machine. It can be done. Don't whine. But don't use threads either.
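
    The "state machine" alternative the parent describes is the classic single-threaded event loop: one process, many connections, a small per-connection state record, and poll() deciding what to advance next, with no mutexes anywhere. A stripped-down sketch (connection setup and cleanup of finished connections omitted):

        /* Single-threaded state machine serving many fds: no threads, no locks. */
        #include <poll.h>
        #include <unistd.h>

        enum conn_state { READING, WRITING, DONE };

        struct conn {
            int fd;
            enum conn_state state;                    /* where this connection is in its lifecycle */
            char buf[512];
            ssize_t len;
        };

        void step(struct conn *c) {                   /* advance one connection by one state */
            switch (c->state) {
            case READING:
                c->len = read(c->fd, c->buf, sizeof c->buf);
                c->state = (c->len > 0) ? WRITING : DONE;
                break;
            case WRITING:
                write(c->fd, c->buf, (size_t)c->len); /* echo back, then go round again */
                c->state = READING;
                break;
            case DONE:
                close(c->fd);
                break;
            }
        }

        void event_loop(struct conn *conns, struct pollfd *pfds, int n) {
            for (;;) {
                poll(pfds, (nfds_t)n, -1);            /* sleep until some fd is ready */
                for (int i = 0; i < n; i++)
                    if (pfds[i].revents & (POLLIN | POLLOUT))
                        step(&conns[i]);
            }
        }
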
  • by jonniesmokes ( 323978 ) on Monday May 15, 2006 @06:03PM (#15338380)
    More important than micro vs. macro, to me, would be the ability to keep the system running and edit it live. I used to do that with Scheme back in my college days. It made me realize how something like the telephone system could keep running 24/7 and never go down. These days, with MS Windows I've got to reboot every 30 days, and the same with these fscking Linux kernel updates. What if I don't ever want to reboot? I think a microkernel/interpreter would let you modify the running system a lot more easily. You could even make incremental changes and then check to make sure they work, preserving the old code so a rollback would be simple.

    The point that Andy makes, which I agree with, is that computer software is still in its infancy. The part I disagree with is that it'll change by him stating the obvious.
  • by BitchKapoor ( 732880 ) on Monday May 15, 2006 @07:19PM (#15338844) Homepage Journal

    I/O channels would help, like IBM mainframe channels, which have an MMU between the peripheral and main memory...

    I've heard from a friend at Intel that their new chipsets which fully support TCPA have this feature. So maybe trusted computing isn't just about copy prevention.

  • Re:Let's see... (Score:3, Interesting)

    by Arandir ( 19206 ) on Monday May 15, 2006 @08:22PM (#15339183) Homepage Journal
    You don't time things when code is available, you time things when people are available to code. For both micro- and macro- kernels, the race started with the 80386 and the affordability of 32-bit CPUs for the average developer. The reason you didn't have Free Software kernels 15 years after the quasi-availability of UNIX source was that hardly anyone could afford the hardware. The ability to collaborate over a network also made a huge difference.

    And Minix 3 doesn't count for real world use. It may be a good starting point for one, and maybe Minix 3.5 might do it. But as of today it would be silly to put Minix on anything but a hobbyist system.
  • by POds ( 241854 ) on Tuesday May 16, 2006 @07:42AM (#15341189) Homepage Journal
    Has anyone considered the fact that this very conversation may go down in the history of computer science? In 30 years, more or less, lecturers will be telling their students about this argument! We're witnessing a more interesting slice of history than our normal, mundane daily lives :)
