
All about Clustering...

King Monkey asks: "Over the past year or so I have seen several mentions on the Internet about connecting computers together in order to pool processing power and resources. I have not yet, however, seen anywhere that explains the differences between the various implementations. What is the difference (if any) between clustering, Beowulf, and parametric processing? These are just the ones I have heard about; I am sure there are more I have not heard about, and I would like to learn about those as well."
  • There are two main kinds of clusters, from what I can see: the supercomputer-like Beowulf(ish) cluster and the high-availability cluster. The multi-processing clusters (I'll call them Beowulfs for the sake of brevity) are designed to help with massive computations. The key element there is to speed up, or parallelize, large amounts of calculation, such as physics simulations or computer imaging (as in Titanic; Linux Journal [linuxjournal.com] has an article about Linux and the movie). For these multi-node systems there are several programming libraries, PVM, MPI, and others, that let you write code which uses the new conglomerate system.
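
    To give you a taste, here's a minimal MPI sketch in C; the MPI calls are standard, but the compile/run commands assume an implementation like MPICH or LAM is installed. Each process sums its own slice of 1..1000000 and process 0 collects the grand total:

        /* Compile with mpicc, run with mpirun -np 4 ./psum (assumes an
           MPI implementation such as MPICH or LAM is installed). */
        #include <stdio.h>
        #include <mpi.h>

        int main(int argc, char *argv[])
        {
            int rank, size;
            long i, local = 0, total = 0;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I? */
            MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many processes? */

            /* Each process sums a disjoint slice of 1..1000000. */
            for (i = rank + 1; i <= 1000000; i += size)
                local += i;

            /* Combine the partial sums on process 0. */
            MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

            if (rank == 0)
                printf("total = %ld\n", total);

            MPI_Finalize();
            return 0;
        }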

    The high-availability cluster is something else entirely. These clusters are built not for speed but for reliability and distributed load bearing. It usually means a group of machines that behave, as far as the user can tell, like one machine. Kind of like a certain major website that we're on. You generally have one or two traffic servers whose job is to send each request to whichever computer meets certain criteria. Perhaps you want load-balancing web servers: the traffic computers would send some requests to one server, other requests to another server, and so on, based on some predetermined criteria. The same mechanism can also make sure no requests go to a dead machine. There is some really good information on this out there, but the most easily digestible is probably at TurboLinux [turbolinux.com] with their High Availability Cluster solution and RedHat [redhat.com] with their Piranha solution.
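
    To make the round-robin idea concrete, here's a toy sketch of the selection logic (the server names and health flags are invented for illustration; a real traffic server would speak TCP and run actual health checks):

        /* A toy round-robin dispatcher: pick the next live backend,
           skipping machines that failed their health check. The server
           names and the alive[] flags are made up for illustration. */
        #include <stdio.h>

        #define NSERVERS 3

        static const char *servers[NSERVERS] = { "web1", "web2", "web3" };
        static int alive[NSERVERS] = { 1, 1, 1 };  /* health-check results */

        static int next_backend(void)
        {
            static int last = -1;
            int i;
            for (i = 0; i < NSERVERS; i++) {
                int cand = (last + 1 + i) % NSERVERS;
                if (alive[cand]) {
                    last = cand;
                    return cand;
                }
            }
            return -1;  /* every backend is dead */
        }

        int main(void)
        {
            int i;
            alive[1] = 0;  /* pretend web2 just failed its health check */
            for (i = 0; i < 5; i++) {
                int b = next_backend();
                if (b < 0) {
                    fprintf(stderr, "no backends alive\n");
                    return 1;
                }
                printf("request %d -> %s\n", i, servers[b]);
            }
            return 0;
        }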

    I know that was oversimplified, but I hope that it helps.

  • First off, it disappoints me that these types of questions get posted here, considering this was answered a little over two weeks ago in an article on Slashdot itself: Linux Clusters Explained [slashdot.org]. But hey, what are you going to do?

    The reason I can see this coming up on Slashdot is that there really isn't a definitive guide out there that you can just kick back and read. \tangent\You would actually have to look. You know, with a search engine or something. That's right, "King Monkey", search engines exist.\/tangent\

    Dun Dun DUN!!! That is, until now. According to O'Reilly's site, http://www.oreilly.com/catalog/clusterlinux/ [oreilly.com], they have a book in the works, and it should be out sometime in August.

    But hey, don't let that stop anyone from actually searching for the information.

    Okay, okay I'll stop picking on King Monkey and Cliff. We all love Slashdot anyhow.

  • As others have mentioned, there are clusters for supercomputing and clusters for high reliability. I know nothing about the former, so I'll concentrate on the latter:

    There are two main ways to deal with failures. One is to keep the data on both hosts consistent at all times; the other is to discover the needed data when one host fails. There are tricky issues with both, but normally it is obvious which to choose. Sometimes a mixture works, which is what we chose for the product I'm working on.

    An example of keeping all data on both hosts is a database. You probably cannot discover at failover time who has appointments when. So you write software that sends each write command to two computers, each running your database. (More commonly, you have a dual-ported RAID disk, so that when one computer fails the backup can work with the master's disks.) If your primary computer fails, you can shift quickly and transparently to the backup. Some places will divide reads between the backup and the master (this doesn't work so well with shared disks, but works great with the two-database approach), so that you never know which computer will get your request. It doesn't matter, though, as both are up to date.
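
    In code, the two-database idea boils down to something like this sketch (db_write() and the host names are invented stand-ins for whatever client library your real database ships with):

        /* A sketch of the two-database write path. db_write() is a
           made-up stand-in for a real database client library; the
           point is simply that every write goes to both machines,
           so either one is current at failover time. */
        #include <stdio.h>

        static int db_write(const char *host, const char *cmd)
        {
            /* Stand-in: a real implementation would open a connection
               to `host` and submit `cmd`. */
            printf("write to %s: %s\n", host, cmd);
            return 0;  /* pretend it succeeded */
        }

        int main(void)
        {
            const char *master = "db-master";
            const char *backup = "db-backup";
            const char *cmd =
                "INSERT INTO appointments VALUES ('3pm', 'dentist')";

            if (db_write(master, cmd) != 0 || db_write(backup, cmd) != 0) {
                /* The tricky part: deciding what a half-applied write
                   means. Real systems lean on transactions or shared
                   disks to dodge this. */
                fprintf(stderr, "write failed on one host\n");
                return 1;
            }
            return 0;
        }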

    An example of discovery is Internet routers. If your Cisco 7000 fails, you pull a backup off the shelf, configure it, and connect the cables. (I don't know about the 7000 series, but for the smaller Cisco routers it is very common to buy two identical ones at a time, configure them identically, connect cables to one, and set the second on top of the first with no cables attached, not even power.) When you connect the backup, it uses the standard routing protocols to figure out the network.

    There are more examples. Like I said, in the project I'm working on now, both make sense in different areas. Discovery takes longer to complete a takeover, but it doesn't have to worry about corrupted data.

  • I recommend an evening in bed with "In Search of Clusters" [fatbrain.com] by Pfister, Prentice-Hall, ISBN 0138997098 (available much cheaper here [bookpool.com]).

    It's not Linux specific, but it is a superb overview of the problems and solutions in low-end parallel computing. It also discusses the three favourite solutions (SMP, NUMA and clusters) in depth and goes over their strengths and weaknesses.
    --
    Cheers

"There is hopeful symbolism in the fact that flags do not wave in a vacuum." --Arthur C. Clarke

Working...