First off, Scout is a networked/distributed OS for embedded devices. The networking side is probably done better in Plan 9/Inferno, and the embedded side is probably done better in specialized embedded OSes, especially as Scout is abandonware. But the sheer difficulty of combining those two ideas in a space that could fit into a digital camera, back when Scout was being developed, means it has some rather ingenious ideas that are worth revisiting. Besides, for all the good ideas in Plan 9/Inferno, it's not a design anyone has exactly picked up and run with, and if Inferno hasn't been formally abandoned, it's as good as abandoned, so it offers no advantage in that regard.
Barrelfish is perhaps the most recent of these research kernels, and its "multikernel" design seems to let you build a highly scalable, high-performance heterogeneous cluster, though I'm suspicious of Microsoft's involvement. Oh, I can believe they want to fund research they can use to make their own OSes run faster, after all the complaints over Vista, but they're not exactly known for supporting non-Intel architectures. On the other hand, ETH Zurich are very respectable and I could see them coming up with some neat code. Anyway, the idea of having a cluster that can work over multiple architectures (i.e., not an SSI-based cluster) is potentially very interesting.
But they're not the only guys doing interesting work. K42 (with HHGTTG references in the docs) is supposed to be a highly scalable, highly fault-tolerant OS from IBM, who quite definitely have an interest in doing precisely that kind of work. Given that IBM currently sells Linux on its HPC machines, it would be reasonable to suppose the K42 research is somehow related, perhaps with interesting ideas working their way into the Linux group. And if that isn't happening, it damn well should be, as directly as licenses and architecture permit.
The L4 microkernel family has been around for a long time now. Although microkernels have their issues, running what would otherwise be kernel modules in userspace has advantages for security, and communicating via messages would (in principle) allow those modules and kernel threads to migrate between cluster nodes: a major headache that Linux-based clusters (such as OpenMOSIX) have a very hard time solving. A sketch of why message passing makes that tractable follows below.
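To make that concrete, here is a minimal sketch of location-transparent messaging. Everything in it (endpoint_t, msg_send, the node numbering) is hypothetical and not any real microkernel's API; the point is only that a sender names an endpoint rather than a memory address, so the kernel can reroute delivery when the receiver migrates.

```c
/* Hedged sketch of why message-based IPC helps migration: the sender
 * names an endpoint, not an address, so the kernel can deliver locally
 * or forward over the network without the caller changing. All names
 * here are illustrative, not any real microkernel's API. */
#include <stdio.h>
#include <string.h>

typedef struct { unsigned node; unsigned port; } endpoint_t;

struct message {
    endpoint_t dst;
    char       payload[64];
};

static unsigned local_node = 1;

/* Placeholder transports: a real system would have shared-memory IPC
 * and a network driver here. */
static void deliver_local(const struct message *m)  { printf("local  -> port %u: %s\n", m->dst.port, m->payload); }
static void deliver_remote(const struct message *m) { printf("remote -> node %u: %s\n", m->dst.node, m->payload); }

/* The only routing decision: is the endpoint currently on this node?
 * If the server migrates, the endpoint table is updated and senders
 * never notice. */
void msg_send(endpoint_t dst, const char *text)
{
    struct message m = { .dst = dst };
    strncpy(m.payload, text, sizeof m.payload - 1);
    m.payload[sizeof m.payload - 1] = '\0';
    if (dst.node == local_node)
        deliver_local(&m);
    else
        deliver_remote(&m);
}

int main(void)
{
    endpoint_t fs = { .node = 1, .port = 7 };  /* say, a file server */
    msg_send(fs, "read block 42");
    fs.node = 3;                               /* server migrated */
    msg_send(fs, "read block 42");             /* same call, new route */
    return 0;
}
```

The contrast with a monolithic kernel is that subsystems there communicate through shared memory and direct function calls, which pins them to one node's address space and is exactly what makes migration so painful.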
One Open Source microkernel that does exactly that is Amoeba, though it has become abandonware. It's a truly amazing piece of engineering for distributed computing that is slowly becoming an amazing piece of dead code through bitrot. However, if you want to set out to compete with Kerrighed or MOSIX, this might be a good place to look for inspiration.
Then there's C5. Fortunately not the one invented by Sir Clive Sinclair, but an intriguing "high-availability carrier-grade" microkernel. Jaluna is the Open Source OS toolkit that includes the C5 microkernel. Now, many are the boasts of "carrier-grade", but few are the systems that merit such a description. The term is usually taken to mean the OS has 5N ("five nines") reliability (i.e., it will be up 99.999% of the time, which works out to roughly five minutes of downtime a year). One of the problems in this case, though, is that if the microkernel requires additional layers to be useful, five-nines reliability in the microkernel alone doesn't mean anything useful. You could build an OS that supported only a single no-op system call and it would trivially hit five nines, but nobody could run anything on it; what matters is the reliability of the whole stack people actually depend on.
Calypso is a "metacomputing OS", which seems to be the latest buzzword-compliant neologism for a pile-of-PCs cluster. On the other hand, anything that makes parallelism abstract and efficient means better utilization of SMP and multicore systems, and therefore better servers and clients for MMORPGs.
I think most Slashdotters will be familiar with FreeRTOS, an Open Source version of a very popular real-time OS. It is being used by some members of Portland State University's rocketry group because it is absolutely tiny and will actually fit on embedded computers small enough to shove into an amateur rocket (a minimal task setup is sketched below). There's a commercial version that has "extra features". I don't like, or trust, companies that do this: altering the number of pathways in the code alters the behaviour of the code that is left, so unless the cut-down version is independently verified and QA'd (doubtful, given the approach being followed), it is not safe to assume that because the full source is good, the cut-down version won't be destabilized. On the other hand, if you want a simple embedded computational device (for your killer robot army or whatever), FreeRTOS looks sufficiently general-purpose and sufficiently hard real-time.
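For a flavour of how small the programming model is, here's a minimal sketch of a FreeRTOS task. The calls themselves (xTaskCreate, vTaskDelay, pdMS_TO_TICKS, vTaskStartScheduler) are the real FreeRTOS API; the toggle_led() board-support routine is hypothetical, as are the stack size and priority, which you would tune for your target.

```c
/* Minimal FreeRTOS sketch: one periodic task, handed to the scheduler.
 * toggle_led() is a hypothetical board-support routine; everything else
 * is the standard FreeRTOS task API. */
#include "FreeRTOS.h"
#include "task.h"

extern void toggle_led(void);   /* hypothetical BSP call for this board */

static void blink_task(void *params)
{
    (void)params;
    for (;;) {
        toggle_led();
        vTaskDelay(pdMS_TO_TICKS(500));  /* block this task for 500 ms */
    }
}

int main(void)
{
    /* 128 words of stack, one above idle priority: illustrative values */
    xTaskCreate(blink_task, "blink", 128, NULL, tskIDLE_PRIORITY + 1, NULL);
    vTaskStartScheduler();      /* never returns if the task was created */
    for (;;) { }                /* only reached if the heap ran out */
}
```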
There are, of course, plenty of other OSes, some closed-source (such as ThreadX) and some open-source (such as MINIX 3), which have some (indeed, sometimes many) points of interest. However, there's not much point in listing every OS out there (Slashdot would run out of space, I'd get tired of typing, and I'd rapidly run out of put-downs). Besides which, at the present time, the biggest problem people are trying to solve is multi-tasking on SMP and/or multi-core architectures and/or clusters, grids and clouds. Parallel systems are bloody difficult.

The second problem is how to provide the option of fixed-sized time-slices out of a fixed time interval, often for things like multimedia: say, a guaranteed 2ms of CPU in every 10ms window for a video decoder. This is not the same as "low latency". It's not even deterministic latency, except on average. (It's only deterministic if a program not only has a guaranteed amount of runtime over a given time interval, but ALSO has a guaranteed start time within that interval.) What RTOSes normally provide is deterministic runtime plus a guarantee that the latency cannot exceed some upper limit; a toy model of such a reservation is sketched below. From the number of times the scheduler in Linux has been replaced, it should be obvious to all and sundry that providing any kind of guarantee is extremely hard, and, as with the O(1) scheduler, even when the guarantee is actually met, you've no guarantee it'll turn out to be the guarantee you want.
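To pin down the runtime-versus-start-time distinction, here's a toy budget reservation in C. All the names are mine and no real scheduler is this simple (real ones use constant-bandwidth or sporadic servers and much more besides); the point is that the task reliably gets its 2 ticks in every 10-tick window, but nothing about the mechanism fixes where in the window those ticks land, which is exactly why guaranteed runtime is not the same as deterministic latency.

```c
/* Toy CPU reservation: the task is promised `runtime` ticks out of
 * every `period` ticks. Illustrative only. */
#include <stdio.h>

struct reservation {
    unsigned runtime;  /* ticks promised per period */
    unsigned period;   /* window length, in ticks   */
    unsigned budget;   /* ticks left in this window */
    unsigned elapsed;  /* ticks into this window    */
};

/* Called on every timer tick. Returns 1 if the task may use this
 * tick, 0 if it is throttled until the window refills. */
static int reservation_tick(struct reservation *r)
{
    if (r->elapsed == r->period) {  /* new window: refill the budget */
        r->elapsed = 0;
        r->budget = r->runtime;
    }
    r->elapsed++;
    if (r->budget == 0)
        return 0;                   /* budget spent: throttle */
    r->budget--;
    return 1;
}

int main(void)
{
    /* 2 ticks guaranteed out of every 10. In this toy the task happens
     * to get the first 2 ticks of each window because nothing competes
     * with it; under real load the scheduler may place those 2 ticks
     * anywhere in the window, so latency is only bounded, never fixed. */
    struct reservation r = { .runtime = 2, .period = 10,
                             .budget  = 2, .elapsed = 0 };
    for (unsigned t = 0; t < 20; t++)
        printf("tick %2u: %s\n", t, reservation_tick(&r) ? "run" : "throttled");
    return 0;
}
```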
A third problem people have tried to tackle is reliability. There's a version of LynxOS (not actually a Linux variant, but a POSIX RTOS with Linux compatibility) which is FAA-approved for certain tasks (it has the lowest certification level possible). There was, at one point at least, also a carrier-grade Linux distro, but I've not seen that mentioned for a while. If you include security as a facet of reliability, then there are also Linux distros that have achieved EAL4+ ratings, possibly EAL5+. Some of the requirements in these projects are mutually exclusive, which is a problem, and clearly the implementations are too, or we'd be seeing new projects evolving FROM these efforts rather than the efforts being almost evolutionary dead-ends.
It would seem logical, then, to go back to the experimental kernels, where the fringes of OS theory are being developed, dyed and permed. Study what people think might work, rather than the stuff that's already mainstream or already dead; see if there's a way to use what's being discovered to unify currently disparate projects; and see if that unification can become mainstream. Even if it can't, not having to re-invent everything is bound to speed up work on the areas that are least understood and therefore most in need of it.