Technology

K12LTSP + MOSIX Howto

Paul Nelson writes: "Richard Camp posted a very complete, step-by-step guide to building a MOSIX cluster. '...The objective of this howto is to guide the reader on setting up a Mosix cluster with diskless nodes. The setup is based on the K12ltsp Project. This should provide an easily scalable system.'"
  • No seriously, this stuff looks good.
  • smart posters that actually remember it's MOSIX not Beowulf?
  • At last... (Score:4, Insightful)

    by Usquebaugh ( 230216 ) on Sunday March 17, 2002 @05:47PM (#3177960)
    This is exactly where I feel Linux should be used. The idea of dumb terminals and a central server has proven to be the most cost effective way for companies to implement computer technology.

    It's becoming clear that Intel/AMD etc. are going to crush most other general-purpose CPUs, be it with SMP or SMT or both. With the increase in PCI bandwidth coming and the heralded 64-bit chips, Intel will start to take over more and more server machines. Remember, in the steel industry people scoffed at mini mills, Kodak scoffed at digital cameras, etc.

    In the future most companies will have dumb terminals and a server room with racks of cheap Intel boxes. The OS on the server will be fault tolerant to the max: oh, I lost a node, ah well, only 255 left. Uptimes measured in years. Hang on a sec, that sounds like an IBM or Sun mainframe.

    What is rapidly becoming apparent is that network speed is now more important than CPU/MEMORY speed.
    • This is exactly where I feel Linux should be used. The idea of dumb terminals and a central server has proven to be the most cost effective way for companies to implement computer technology.
      [...]
      In the future most companies will have dumb terminals and a server room with racks of cheap intel boxes. The OS on the server will be fault tolerant to the max, oh I lost a node ahh well only 255 left.


      I'm trying to figure out what the benefit of this is. You'd have to maintain the user clients - which will still break down - and the server nodes on top of this.

      You get fault tolerance - but user terminals don't need uptimes of years with transparent failover. You get centralized administration - but there are many ways of making this happen with user workstations too (witness the NT systems here that re-image their own drives every week).

      Performance will always be worse with a centralized solution than with user workstations, because you have no local disk for fast scratch space (used by many applications in the environments I've worked in).

      If computers cost $10k apiece, I can see cost being an issue, but if the cost of hardware and maintenance for a user's machine is much, much less than the cost of the user sitting at the machine, I don't see any justification on the basis of cost either.

      How is this supposed to be a "most cost-effective" solution, again?

      [Disclaimer: I think dumb terminal systems are nifty; I just don't think they're useful under most business conditions.]
      • This is the promise not of centralized administration, but of centralized information. Forget worrying about putting your documents on the file server - no matter what computer you sit down at, your documents are there.

        And terminals are a much better solution than NT administration - there is nothing like the pain I feel watching my profile download when I sit down at a new station. And my profile is only 1.4 MB. NT requires manipulation of local settings, synchronization, etc. Terminals don't have local storage to sync, no local settings to change.

        (witness the NT systems here that re-image their own drives every week).

        If weekly re-installation of the operating system is used to keep people from changing the system, then you need security settings. If it is used to keep the operating system functional, then you need a better operating system. Other than that it seems like something a Windows Admin might do for fun ("hey guys, check this out!").

        Imagine now being at home. I presume you have a computer, maybe two. If they're the same OS, you have probably customized the interface identically. You probably move files back and forth between them depending on which you use. Now think about sitting down at either computer and changing settings for both at the same time. Having all the files in all the same places. All the same software installed. You update one program, both computers get the update... With me?

        That is why I think it's cool. Any other takers?

      • Maintaining user hardware is a plug-and-unplug solution. The hardware is broken? Switch in the new hardware. Keep it generic.

        Centralised administration for security/backups etc. is far cheaper and more reliable than spreading it out. Also, what happens if we need a faster machine? In a distributed environment every PC has to be upgraded; in a centralised model only one does. Now the cost of the new hardware may be the same (unlikely, but maybe); the cost of installation will not be.

        Performance will be no better or worse than that of the central server/network; no local storage. Nothing to tune, fiddle with or break locally.

        The cost of any hardware/software directly affects the bottom line. It matters not how much the PC costs in relation to the user's salary.

        Dumb terminals have worked for businesses for over thirty years. One of the reasons they work is that PCs are very lightly utilised by most users; a lot of SETI cycles are a testament to this. If the machines are centralised then the cycles can be spread between many users.

        I think I covered all of your points. You could attack a centralised solution on a number of points, but TCO is not one of them.
      • because you have no local disk for fast scratch space


        no, but with the $100 you saved not buying a disk you could have an extra 256MB of VERY FAST scratch space.
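
        For example, under Linux that RAM can be handed out as scratch space with tmpfs (a minimal sketch; the mount point and size here are made up):

        mkdir -p /scratch
        mount -t tmpfs -o size=256m tmpfs /scratch   # RAM-backed, gone at reboot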

    • Why are the dumb terminal people still hoping they'll be right in the end?

      When the price of a dumb terminal and a usable computer are within spitting distance of each other, what's the cost saving of a dumb terminal? Woo, I saved $50, but now I'll lose hundreds of times that in productivity while my user is waiting on the network when they could have gotten data from the hard drive.

      Here's a better and more likely scenario: the servers are missing. Instead, the users' desktops use their spare cycles to process requests and such while their user is deciding whether the next word in their memo should be "Sir" or "Madam". Transparent redundancy, automatic backups, all for 'free'.

      In the future, servers will be reserved for those rare situations where they are actually and truly necessary, like ultra-large scale account processing (think Visa).

      Dumb terminals died with the birth of the $300 PC... and prices continue to drop. (BTW, I'm not counting the display, because the dumb terminal will need one too.) There are just too many advantages to making sure there's power on the desk to sacrifice them to a 1970s computing architecture.
      • Re:At last... (Score:2, Informative)

        Distributed servers are an interesting proposal, and might be the solution in the end, but in the interim it is just not feasible. Clusters *always* run better when there is a central server to coordinate tasks. The algorithms for distributed coordination are just not there. Network protocols (P2P) for distributed file-sharing are getting closer, but are still not scalable without huge performance hits. And the most important point of all: $2000 will still buy a machine that is up to the task of serving hundreds of clients. When/if processing speed hits a ceiling (price/performance) and home users no longer need twice the power they needed a year ago, distributed serving will be price effective. Until then, a structured environment offers so much more in terms of manageability that the price is worth it.
        • I was speaking of the future, as was the original poster. Right now, we've probably got the optimal solution for 'right now' technology. Or at least fairly close to optimal.
      • I'm not hoping to be right or wrong. There are benefits to the client/server model. TCO is not one of them.

        The initial purchase price of hardware is trivial; it's the support and maintenance that kill companies. Downtime and tech salaries are huge costs.

        The servers go missing? Not yet, but maybe in the future; I don't see it, though. But hell, I've been wrong before :-) This would require a fast/very fast network, much faster than that needed for terminals. Network storage is hard to distribute. You get nothing for 'free'; everything costs, it's all a matter of how much. All the points you make in favour of this apply even more to a centralised server.

        Hell, terminals have been around since the sixties. Maybe even before. Just because it was discovered a long time ago does not make it invalid. Power on the desktop is useless without a network.
  • You know, when Slashdot posts articles like these, I can't help but wonder.

    This should provide an easily scalable system

    Yeah, right. As if anybody who reads Slashdot is going to go "Cool! I'll go and build a Mosix cluster with diskless nodes now! I've always wanted an easily scalable system and this looks like it might be it!".

  • The Mandrake Mosix Terminal Project [dynu.com] is extremely similar and is based on the k12ltsp concept. Check it out if you can. K12ltsp is great for rolling out large numbers of LTSP servers quickly.
  • MOSIX Deployment (Score:5, Informative)

    by PatJensen ( 170806 ) on Sunday March 17, 2002 @06:00PM (#3177991) Homepage
    There is a much better, easier way to deploy a MOSIX cluster. This article is poorly written, and missing several important sections. It is not for the kernel-phobic or beginner users. The preferred way to bring up a diskless cluster with easy tear down, no maintenance, and no network booting required is to use ClumpOS.

    ClumpOS is a bootable CD with network drivers that is pre-setup with a custom kernel that contains MOSIX and MFS out of the box with no work required. You can download and burn ClumpOS and then boot it on your slave machines.

    As far as building your MOSIX master goes, I prefer Debian with the prebuilt, easy-to-deploy MOSIX packages and kernel patches. The links to find both are below:

    Clump/OS: A CD-based mini distribution [psoftware.org]
    MOSIX on Debian [infofin.com]

    MOSIX is a fun, extremely useful tool. Just remember when building your Debian kernel to make sure to turn ALL options on for MOSIX, this includes MFS. Otherwise, you will have weird problems with not being able to migrate processes to your cluster.
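
    A rough sketch of that Debian build (the package names, kernel version and make-kpkg flags here are from memory, so treat them as assumptions and check the archive before copying this blindly):

    # install the MOSIX userland, the kernel patch and a kernel source tree
    apt-get install mosix kernel-package kernel-patch-mosix kernel-source-2.4.18
    cd /usr/src && tar xjf kernel-source-2.4.18.tar.bz2 && cd kernel-source-2.4.18
    make menuconfig                              # enable every MOSIX option, including MFS
    make-kpkg --added-patches mosix --revision=mosix.1 kernel_image
    dpkg -i ../kernel-image-2.4.18_mosix.1_i386.deb
    # after rebooting into the new kernel, describe the nodes (e.g. in /etc/mosix.map)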

    -Pat

    • Re:MOSIX Deployment (Score:2, Informative)

      by Halo5 ( 63934 )
      The only problem with Clump/OS is that it doesn't handle multiple NICs well (it can handle it, but the installation process is a little tougher). This kinda sux because with MOSIX it's better to dedicate a NIC to MOSIX communication and use a second NIC for standard networking. The Clump/OS group is working to improve this situation; hopefully they are making progress...

      On a brighter note, Clump/OS has a cool monitor program called ClumpView, which looks awesome!
    • This article is poorly written, and missing several important sections. It is not for the kernel-phobic or beginner users.

      Is setting up clusters something "kernel-phobic" or "beginner users" should be attempting in the first place? Really, it is kind of funny to expect an article to be aimed at that audience.

    • > There is a much better, easier way to deploy a MOSIX cluster.


      There may be better ways to put together a cluster, but the K12Linux/Mosix combo may become common, because the "terminals" schools buy will probably be cheap PCs which are overpowered for strict X-terminal duty. It seems like a shame to let those processor cycles and memory go to waste. At least until there is a big enough market for someone to create real X terminals for the K-12 market.

  • Mosix rules (Score:2, Insightful)

    I think this is where clustering should be done, for now at least, at the thread level. Most programs are multi-threaded. Most people don't want to rewrite programs to support MPI or PVM. Lots of projects that previously had to implement their own clustering protocols can just utilize Mosix instead. If I could talk my boss into it, I would put Linux/Mosix on every desktop at work and have a giant Mosix cluster. This is the future of computing.
    • This idea is used at CERN. Many desktops belong to a cluster (managed with Condor), but only when not in active workstation use. Therefore the full clustering effect only kicks in at night, but then again daytime desktop use is not slowed down by batch work.
      • Could you pass on a little more info on the CERN clustering? Several of the people I support work closely with CERN and have mentioned it but didn't really understand the details. URL?

        Thanks!
        • I only know it from the user's point of view and can't tell much more. They also have dedicated clusters (i.e. no desktop usage). Both kinds of clusters run Linux but they probably have others as well (it's a huge organization, about 7,000 people, so I don't know everything :-). Maybe the cern.ch webpages and/or Google will lead you further.
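
          For anyone curious what the user side of such a pool looks like: a Condor batch job is a small submit file handed to condor_submit (the file names and the executable below are made up for illustration):

          # job.sub -- minimal Condor submit description for a hypothetical job
          universe   = vanilla
          executable = /home/user/analysis
          arguments  = run42.dat
          output     = run42.out
          error      = run42.err
          log        = run42.log
          queue

          condor_submit job.sub    # the job runs whenever an idle machine in the pool is available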
  • Using the Linux Terminal Server, this could be a good idea for schools. They usually have a bunch of computers with the same HD image. By using clusters, they would never need to upgrade the image on all the computers, there would be no need for expensive HDs, and the noise level would probably fall quite a lot!
  • PXE (Score:3, Informative)

    by GigsVT ( 208848 ) on Sunday March 17, 2002 @06:30PM (#3178074) Journal
    Having set this up myself, it seems the author has a few misconceptions about PXE. These seem to be common, as I get into heated discussions on IRC with people who have never done this themselves, but seem to think they know better than I do for some reason. I may have some minor errors in my description below, but I think it's mostly correct.

    First off, his cluster isn't really diskless, since he uses floppies.

    PXE is an Intel specification, but it is open as far as I know. Intel provides binary only daemons for PXE for Linux. PXE is a way to get around the 640k limitation that is inherent when using the bootp(or dhcp)/tftp boot methods.

    Contrary to what the author implies, PXE is not something that is supported in the kernel. PXE is a userspace daemon that allows the workstations to download the whole kernel, and it can also present some pretty complicated menus to the user. It is one type of bootstrap, and it is pretty complicated to set up. The PXE daemon for Linux isn't documented very well either, and requires some strange configuration of itself, and also of the DHCP daemon on the server.

    Basically, the way I understand it, the DHCP process begins normally from the workstation boot ROM, and the DHCP server returns a specific value that tells the workstation information about PXE. The PXE client then connects to the PXE server, and the user is presented with boot options, which can be complex.

    I didn't use PXE in my final cluster though, due to the extra complication. What I found out was that the SYSLINUX people write something called PXELINUX. PXELINUX is misnamed because it does not use PXE; rather, it is a bootloader that loads over the normal BOOTP/TFTP method, which is loads simpler to set up and maintain. PXELINUX should be thought of as a replacement for PXE.

    Without a boot loader, a lot of the docs say you can just send the kernel directly to the client. This would work, but only if your kernel is less than 640k, as tftp/bootp operate in real mode and have to download the whole thing before they begin booting. (BTW, the docs on diskless setups in Linux are extremely out of date for the most part.)

    With a raw kernel setup, it's also impossible to pass the kernel any boot options. It's the same as if you dd the kernel to a floppy device.
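
    To give a feel for the setup, this is roughly what the server side of the PXELINUX route looks like (a sketch only; the addresses, paths and NFS root below are assumptions for illustration, not taken from the howto):

    # /etc/dhcpd.conf (fragment) -- point the clients at the TFTP server and the bootloader
    subnet 192.168.0.0 netmask 255.255.255.0 {
        range 192.168.0.100 192.168.0.200;
        next-server 192.168.0.1;           # the TFTP server
        filename "pxelinux.0";             # the bootloader image served over TFTP
    }

    # /tftpboot/pxelinux.cfg/default -- the kernel plus the boot options a raw kernel couldn't get
    DEFAULT linux
    LABEL linux
        KERNEL vmlinuz-mosix
        APPEND root=/dev/nfs nfsroot=192.168.0.1:/opt/ltsp/i386 ip=dhcp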

    I gained a lot of knowledge about diskless booting in modern Linux in my setup, if anyone wants me to write a book, I'm open to offers. :)

    -Gigs
    gigs(at)vt(dot)edu-cational
    • Re:PXE (Score:1, Insightful)

      by Anonymous Coward
      Your explanation has errors too.

      There is no inherent 640kB limitation to bootp/DHCP/tftp. A protected mode boot loader such as Etherboot can access extended memory. Even a real mode boot loader can access extended memory by using BIOS calls that date back to the AT.

      Of course you have to download the whole thing before booting. You wouldn't want the kernel to start on incomplete segments would you? But perhaps you imagine that it has to be copied out of the lower 640kB. Not so. The boot loader can place the bzImage kernel exactly where another boot loader like LILO would place it, at 0x100000. The initrd is another matter, this is a historic shortcoming in kernel support for initrd so it has to be relocated to the top of memory (well, not exactly the top, there are limitations there due to Linux, see setup version 0x203 for the extra variable which tells the boot loader the highest valid address).

      One of the real reasons for not sending a raw kernel is finally mentioned in your penultimate paragraph, the need to pass options. Another is that the ROM boot loader is general and shouldn't have to know anything about Linux. Hence you need a secondary loader like PXELinux or a wrapper around the kernel image, like Etherboot.

      My credentials? http://etherboot.sourceforge.net/

      I don't think I'll be buying your book anytime soon.
      • There is no inherent 640kB limitation to bootp/DHCP/tftp. A protected mode boot loader such as Etherboot can access extended memory.

        Yes, but that boot loader has to fit in real mode first, that was my point. The ROM boot loader itself can't download a file bigger than about 500k due to real mode considerations.
        • Re:PXE (Score:1, Interesting)

          by Anonymous Coward
          That's no sweat at all. 640kB is plenty. Etherboot runs in 48kB all up. PXE loaders are limited to 32kB but they make calls to the extension BIOS to get network services, whereas Etherboot has the NIC driver built in.

          And you still don't get the point about the boot loader being able to load into extended memory directly. Etherboot loads bzImages and initrds. People have pumped down huge ramdisks hundreds of MBs big. Believe me, real mode is no obstacle to writing into extended memory.
    • Re:PXE (Score:2, Informative)

      by Chad Page ( 20225 )
      With Debian and the 82559-based Intel NICs I've used with PXE, I did not need any special PXE settings, just pxelinux and the dhcp-server and tftp-hpa daemons in Debian woody/unstable. I've successfully booted a kernel with a ~30-31MB initrd.gz (decompressing to 128MB) and it's not difficult to use at all. I'm building three diskless cluster boxen using the D810EMO which also works very well for this. You can even load memtest86 to test the box, since it uses the Linux loader.
  • I would like to know if there are any really smart uses of this out there. Anyone using this at home for anything clever? Or at a small business/educational institution? I don't want to know about large companies doing 3D simulations with clusters.
    Something useful that can be used at home would be interesting...
    • I'm using it at home, mostly it just crunches dnet packets all day. Dnet doesn't migrate over MOSIX due to its use of shared memory between threads, so a shell script in normal Beowulf style is used to start it:

      ssh user@node1 dnetc
      ssh user@node2 dnetc
      [...]

      Then a corresponding:

      ssh user@node1 killall dnetc
      ssh user@node2 killall dnetc
      [...]

      For something like dnet, this works well.

      I've also played with distributed John the Ripper, but it also doesn't parallelize, so the way it has to be done is to break the target shadow file up into equal chunks, ideally the same number of chunks as you have nodes. John does migrate, though, so you can start and stop all the processes on a single node, and MOSIX will migrate them out.
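
      Something like this, for example (a sketch assuming four nodes and a target file called crackme; the names are made up):

      # split the target into one piece per node, then start one john per piece;
      # the processes start locally and MOSIX migrates them to idle nodes
      split -l $(( ($(wc -l < crackme) + 3) / 4 )) crackme part.
      for f in part.*; do
          john "$f" &
      done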

      I'm currently limited because I haven't set up MFS, which means I/O-bound processes don't migrate. Once I set that up, it will open up the cluster to a whole new class of applications. Generally, MOSIX@home hasn't been as useful as I first thought it would be, as most desktop applications don't migrate very well, but if you do any heavy CPU-bound stuff, then MOSIX might be for you.
  • OpenMOSIX (Score:4, Informative)

    by GigsVT ( 208848 ) on Sunday March 17, 2002 @06:38PM (#3178111) Journal
    Also, be sure to support OpenMOSIX [openmosix.org]

    Apparently MOSIX is going to go closed source, so test out OpenMOSIX if you can. The project is really taking off and has several contributors, but it needs your help in testing the kernels. OpenMOSIX is being successfully used in major installations now, so it should be fine for what you want to use it for, and you won't be starting down a (soon to be) proprietary path.
    • Other than openmosix webpages, I see no mention of MOSIX going closed source. Could someone explain how that would work, given that MOSIX is right now a patch to the kernel? Are the maintainers planning on porting it to patch against a BSD kernel or go fully userland?
      • Binary-only kernel modules are allowed. Think Nvidia.
      • It's not clear yet, but there are strong signs that the GPL MOSIX code will be abandoned. The development of userland MOSIX is another strong sign. Moshe Bar is (or was) very close to the development of MOSIX, and he said there are strong indications that MOSIX is heading for proprietary. MOSIX development was never very open anyway; it was tightly held by the Professor and his students for the most part.
      • 1. MOSIX was developed for other systems, including BSD and Solaris before the Linux version.
        2. About the MOSIX license: there is no longer any mention of the GPL as the license for MOSIX on the MOSIX web site. There used to be one. The lawyers may enforce the GPL of the Linux kernel, but the lawyers can't force anyone to continue to maintain that project. This is why we now have openMosix.
        3. Moshe Bar had a very enlightening chat/interview at the SourceForge clusters foundry a couple of days ago: http://foundries.sourceforge.net/clusters/index.pl?node_id=41457&lastnode_id=131
        4. Regardless of anything else, I think Prof. Amnon Barak (the original author of MOSIX) has done a great thing by releasing the versions he did as GPL, and we should all be thankful.
        5. Take this with a grain of salt, I'm going to be part of the openMosix team (still learning the code) and work for Qlusters. Prof. Amnon Barak may or may not have another story... ;-)
  • clustermatic! (Score:1, Informative)

    by Anonymous Coward
    http://www.clustermatic.org/

    Similar to MOSIX - bproc and a suite of tools for getting a diskless cluster up and going quickly. Very cool - currently used for clusters of 128 to 1024 nodes.
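
    From the user's side (a sketch, assuming a configured bproc front end; the exact flags may differ), the nodes look like extra process slots on the master:

    bpstat                 # list node numbers and their up/down state
    bpsh 0 uname -r        # run a command on node 0; it shows up as a local process
    bpsh -a uptime         # run it on every node that is up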

"When the going gets tough, the tough get empirical." -- Jon Carroll

Working...