IBM Creates New Fastest Beowulf Cluster
shawnb writes "It seems that IBM has created the world's fastest Linux cluster, built from lots of small servers (64 IBM Netfinity 256 servers). The Netfinity servers are linked together using "special clustering software and high-speed networking hardware, which causes the separate units to act as one computer, delivering a processing speed of 375 gigaflops, or 375 billion operations per second." They also go on to say that while this is the fastest Linux supercomputer, "it will only rank 24th on the list of the top 500 fastest supercomputers.""
Not 64 Netfinities... 256... (Score:1)
only rank 24th ? (Score:1)
You know what (Score:1)
Then I wonder if we can make a Beowulf cluster of these things! Oh wait, that's what they're doing with 'em...
Ummm, ahhh, ummmm... Grits anyone?
With only 64? (Score:1)
Re:Only 24th (Score:1)
It's probably a software limitation, but probably not a bad one. Large clusters get unwieldy quickly, and network latency and bandwidth are the bane of any parallel programmer's existence. Communication between the nodes of a cluster is several orders of magnitude slower than referencing internal memory, and no real parallel program has fully autonomous nodes. It's no coincidence that Donald Becker, a major contributor to Beowulf on Linux, also wrote huge chunks of the kernel networking code, tons of network card drivers, and a network channel bonding implementation for Linux.
My point is that you can create a cluster with thousands of nodes, but doing so is an administrative and technological nightmare. For most parallel problems, it's much easier (and generally more efficient) to have a smaller number of more powerful nodes.
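To see how big that gap really is, here's a minimal MPI ping-pong sketch (my own toy code, nothing to do with the IBM cluster) that times small-message round trips between two nodes. Over fast ethernet the one-way latency typically lands in the tens to hundreds of microseconds, versus well under a microsecond for a local memory reference.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, i, reps = 1000;
    char byte = 0;
    double t0, t1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);                /* start everyone together */
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {                        /* rank 0 pings, rank 1 pongs */
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average one-way latency: %g microseconds\n",
               (t1 - t0) / (2.0 * reps) * 1e6);

    MPI_Finalize();
    return 0;
}

Build it with mpicc and run it as two processes on two different boxes, then compare the number you get against the nanoseconds a memory access takes.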
--
odds of being killed by lightning and
Re:Cost/performance (Score:1)
--
IO (Score:1)
http://theotherside.com/dvd/ [theotherside.com]
I wonder... (Score:1)
Mike
Re:PVM or equiv. in the Linux Kernel? - NO (Score:1)
Ideally you'll implement everything at user level, including your networking, so you never have to go into the kernel.
Re:Uhm... I don't think so... (Score:1)
The UP machines might have a bus for each CPU, but the SMP machines' busses are still a lot faster than having to go through Ethernet for inter-CPU communication. Two SMP machines will beat four UP machines any day, at least until we get faster networking (or more importantly, lower network latency).
Re:Gigabit? (Score:1)
Re:RC5? (Score:1)
If you bothered to read the RC5 FAQ, you'd see that they've been asked this question enough times for it to have an item in the FAQ all of its own. They say, basically, that the RC5 distributed network is pretty much the same as a Beowulf network in terms of processing efficiency for RC5, so a Beowulf would perform much the same as the same number of machines operating independently on the project.
Re:MS FUD (Score:1)
Microsoft could do the same thing - install the NT kernel on a bunch of machines and run a program on top of it to actually pool machine resources - if they felt like it, but that's not at all their target market. Really, no pure OS vendor is going to approach the supercomputer market. There's no demand for just OSes... people want machines. Since people are kind enough to develop Beowulf and Linux, vendors can play around with the idea of using Linux clusters.
Re:IO (Score:1)
Really, there are so few applications out there that are actually hard disk intensive. And for those, it's actually beneficial to use disk arrays for the redundancy.
Everything else can benefit from more RAM to avoid having to swap to the disk. RAM's solid state. Hard drives aren't. Don't expect them to ever have more than a very small fraction of the speed of RAM, cache, etc...
Re:Extreme Linux here (Score:1)
Re:Extreme Linux here (Score:1)
Not completely, there is also mosix [freshmeat.net].
Thanks for the correction ... (Score:1)
My background is variational methods (the theory of it) and I should have been more specific in narrowing the applications down.
Re:Purpose...? (Score:1)
finite elements, particle methods esp. in 3 dimensions (very computationally intensive), finite difference (nobody uses that anymore IMNSHO), whatever they use in molecular modelling.
What is that used for:
crash simulation (cars etc.), airplane design, engine design, meteorology (very ugly because of 3-d problems), fluid dynamics, nuclear weapons simulation (somewhat ugly for other reasons).
There's no way there will ever be enough computing power, even for very non-esoteric applications. For instance, imagine doubling the density of weather stations in three dimensions: this results in 8 times more input data, which causes 64 times more computing (theoretically) and perhaps much slower convergence of your solving algorithms.
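Back-of-the-envelope version of that arithmetic (the quadratic solver cost is an assumption on my part about where the 64 comes from):

#include <stdio.h>

int main(void)
{
    double density = 2.0;                       /* double the station density       */
    double data = density * density * density;  /* in 3-D: 2^3 = 8x the input data  */
    double work = data * data;                  /* assumed O(n^2) solve: 8^2 = 64x  */

    printf("%.0fx the data, %.0fx the computing\n", data, work);
    return 0;
}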
Re:You can do it too! (Score:1)
I believe you're wrong there. Although Beowulf clusters are nice for applications that are computationally intensive but don't require much bandwidth, a lot of problems require a lot of bandwidth. This is where SMP boxes rule, since bandwidth between processors on a single motherboard/backplane is a LOT larger than across a network. For these types of problems, the time required is more a function of bandwidth than of CPU, so an SMP box with fewer processors but more bandwidth between the processors will beat even a large beowulf/clustering solution.
Re:Damn. Way too slow (Score:1)
yup. top500 [top500.org]
Network scalability (Score:1)
Of course that's only an educated guess on my part. If I'm incorrect, note that Myrinet is coming out with single unit 64 and 128 way switches later this year. That should help improve the interconnect situation for larger clusters a great deal. Prices will be dropping too, possibly putting Myrinet in reach of groups with smaller budgets.
--Jason
Re:SMP on the nodes? (Score:1)
--Jason
Re:Clusters and supercomputers compared (Score:1)
Tasks such as 3D rendering are not very communications intensive, so a beowulf-style machine with processors that compare to those in commercially available machines will run at about the same speed. Communication intensive tasks, such as meteorological simulations, don't run as well unless you shell out the big bucks for better interconnect.
The linpack benchmark, which solves a system of linear equations, is used to determine ranking on the top500 list. See the website for more info.
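Roughly speaking, the reported number is the standard Linpack flop count for factoring and solving an n-by-n dense system, about 2/3*n^3 + 2*n^2 operations, divided by the wall-clock time. A toy sketch with made-up numbers (not LosLobos figures):

#include <stdio.h>

int main(void)
{
    double n = 20000.0;       /* hypothetical problem size       */
    double seconds = 3600.0;  /* hypothetical time for the solve */
    double flops = (2.0 / 3.0) * n * n * n + 2.0 * n * n;

    printf("sustained rate: %.2f gigaflops\n", flops / seconds / 1e9);
    return 0;
}

Sites generally run the largest n that fits in memory, since the sustained rate climbs toward peak as the problem gets bigger.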
--Jason
24th on top500? Unlikely (Score:1)
Unfortunately, one chip isn't actually going to produce 733 MFlops on Linpack. A PIII-500 gets about 200, which is 40% of the TPP. Dunno much about the 733 MHz chips (except the cache runs at full processor speed, but if it's only 512 KB it's not going to offer much improvement), but I'll be nice and say it gets 75% of the TPP. OK, I'm probably being REALLY nice there. That leaves us with ~280 GFlops.
Of course, for a press release 375 GFlops looks a lot better.
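For reference, the press-release figure matches the raw peak if you assume one floating-point op per clock per Pentium III and 512 of them (both assumptions on my part). This little sketch just multiplies that out and shows what different Linpack efficiencies would leave you with:

#include <stdio.h>

int main(void)
{
    double cpus = 512.0;
    double clock_hz = 733e6;
    double flops_per_cycle = 1.0;                    /* assumed for the PIII */
    double peak = cpus * clock_hz * flops_per_cycle;
    double eff[] = { 0.40, 0.60, 0.75 };
    int i;

    printf("theoretical peak: %.0f GFlops\n", peak / 1e9);
    for (i = 0; i < 3; i++)
        printf("Linpack at %2.0f%% of peak: %.0f GFlops\n",
               eff[i] * 100.0, eff[i] * peak / 1e9);
    return 0;
}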
Jason
Re:Imagine (Score:1)
Re:Probably Beowulf (Score:1)
Re:Cost/performance (Score:1)
What exactly *is* a Netfinity 256? (Score:1)
In fact, I took a glance at the IBM web site and I couldn't find any such machine...
Am I blind or simply stupid?
Re:With only 64? (Score:1)
PVM or equiv. in the Linux Kernel? (Score:1)
There is a kernel-httpd in the development branch of the Linux Kernel already - So wouldn't a feature in the kernel to enable parallel computing or clustering be the next big step?
Re:Gigabit? (Score:1)
Re:Uhm... I don't think so... (Score:1)
Assuming they are using the Netfinity 8500R (8-way SMP), there could be a total of 2048 processors.
Re:Uhm... I don't think so... (Score:1)
I went with 10K gigabytes, which means all you need is 134 of those new 75 gig HDs.
So how about 140 dual-processor Alpha systems with 500 MB RAM, a 75 gig HD, and dual Fibre Channel networking, with a Cray as the controlling host?
Oh, pardon the drool. ahhh ahah PORK CHOPS!!!!!
;P
Re: Lobos = Wolves (Score:1)
Of course it could just be a reference to the UNM basketball team, the Lobos [golobos.com]
Gigabit? (Score:1)
Re:Analysis of Linux and Commercial software. (Score:1)
AND people have been making money out of Linux, certainly more money than Amazon makes.
Extreme Linux here (Score:1)
BTW, if anyone wants a copy of Extreme Linux (the Beowulf clustering CD) that was made a few years back, drop me a line at nicksworld@netscape.net
I'll give you a place to download it, or if you send some $$$ I'll send you a CD copy of it.
Re:Extreme Linux here (Score:1)
Parallel Processing over Ethernet is a Beowulf Cluster.
Re:(OT)How does a first post get marked as redunda (Score:1)
Instead of referring to the post itself as redundant, maybe the moderator was referring to the content of the post. Oh heck, I don't know.
Re:RC5? (Score:1)
Dunno if this cluster is being used in the effort, though.
nice, but why aren't they doing this with PPCs? (Score:1)
More on Beowulf Clusters (Score:1)
Though I don't see anything about its speed in terms of gigaflops
And here's one press release. [pccweb.com]
RC5? (Score:1)
Re:Purpose...? (Score:1)
The Los Lobos supercluster is part of the National Science Foundation's National Computational Sciences Alliance program, which gives scientists remote access to the fast machines needed for scientific research.
Re:What kind of network speed do you need? (Score:1)
Interesting that the Wired article claims that the machine has 64 nodes but that UNM press release says 256. I wonder what else Wired got wrong...
Re:maybe just maybe (Score:1)
I do know that they compared the NCSA NT Supercluster with the Linux cluster at Sandia National Laboratory. This is the second "large" cluster within the Alliance and sort of provides a counterbalance to our NT cluster.
You are quite right in that there should be no big difference between operating systems for applications that are largely computation intensive. The big differences would come from applications that are heavily file intensive or communications intensive. Both clusters use a high-performance interconnect called Myrinet, but the NT Supercluster uses a messaging layer developed here at the University of Illinois called Fast Messages, which provides very low latency and high bandwidth. The last I heard, the Sandia Linux cluster used TCP/IP over Myrinet, and I do not believe this offers as low a latency as FM.
Re:Answer to Greg Koenig - (Score:1)
One reason you might want to use NT as a target for deploying clustering technology is that it could be argued to be more widely available than Linux (right now). If I extend my definition of a "cluster" to include machines on peoples' desktops, then if my solution can utilize the operating system present on many of these machines, perhaps I can do something interesting. The fact is, people DO pay money for NT, so building technology on top of it may make some bit of sense.
Also, there is a Linux version of our clustering software available. It just happens that the NT version was developed first.
Re:maybe just maybe - how do they keep them up!? (Score:1)
But, your point about remote administration is quite valid. Actually, I'd extend this to "any remote access" to NT being the crux of the problem. Launching jobs on a group of remote NT machines and getting the stdout back to the submitting machine has definitely been a pain with this project. With a Unix operating system, you could simply rsh each of the components of your parallel job. With NT, we ended up using a piece of commercial software that provides this functionality, although there are issues with it that make it not as nice a solution as you would have out-of-the-box with Unix.
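Just to make concrete what "simply rsh each of the components" looks like, here's a toy launcher sketch (the host names, the job binary, and its --rank flag are all made up for illustration, not our actual setup):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *hosts[] = { "node00", "node01", "node02", "node03" };
    char cmd[256];
    int i;

    for (i = 0; i < 4; i++) {
        /* start one piece of the parallel job on each node; its stdout
           comes back over the rsh connection to the submitting machine */
        snprintf(cmd, sizeof cmd, "rsh %s /home/user/myjob --rank %d &",
                 hosts[i], i);
        system(cmd);
    }
    return 0;
}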
Another related idea about NT being oriented to "the user sits at the console" as opposed to using the machine from remote comes when a job abnormally terminates. The NT debugger, Dr. Watson, tends to pop up and wait for the user to press "OK" to continue. If you're not sitting at the physical console, this is a problem. We've had to use some workarounds to deal with this situation.
So, all-in-all, while there have been some studies that suggest NT performs slightly better for some types of scientific applications and some interesting results have been obtained by using NT, there have been many many days when I have really wished we had used Linux instead. Fortunately there are BOTH NT and Linux clusters, so scientists can choose which one they think will work best for their particular application.
Re:LINUX SCALABLE, SOLARIS/IRIX/*BSD NOT! (Score:1)
The key to the problem is that you are comparing clustering to SMP. If I can put a 256-proc SGI O2k box out, it's an order of magnitude faster than a clustered Intel box for most applications. To cover the rest of those applications, introduce clustering - both SGI and Sun have clustering software.
So, you take advantage of the extremely fast SMP capability in your Sun or SGI and use the flexibility of clustering... let's say you combine 5x 256-CPU SGI Origin boxes, and you've got a kickin' option, because they only have to go over the EXTREMELY slow network bottleneck when they have to go between 5 machines, instead of going over the EXTREMELY slow network to go between the 1280 single-proc Intel machines (or 160 8-way SMP boxes).
I'm not knocking Beowulf (side note: Beowulf is NOT an IBM product); it's great for specific applications that need lots of power for low cost. But the ability to have 256 procs access the same memory at extreme speeds is something that can't be ignored.
Hmmm... wanna rethink that light years statement?
--
Spelling and grammar checker off because I don't care
Re:Analysis of Linux and Commercial software. (Score:1)
What kind of network speed do you need? (Score:1)
Anyway, I'm managing a 60-node Alpha XP1000 cluster connected with a 100 Mbit network, and I'm using a Cisco Catalyst. If you check their page, you'll see that the Catalyst 8500 can have a 128-port switch for 10/100 Mbit, but only 64 ports for Gigabit connections.
Indeed, you should look at how much your parallel jobs are communicating with each other and how much traffic they have to support. You certainly don't want to spend a fortune on gigabit connections if you're sending out very few packets.
I'm pretty confident that a 100Mb will do for most of you out there!
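If you want to put numbers on it, a crude cost model for a single message is latency plus size over bandwidth. Here's a tiny sketch (the 100 microsecond latency figure is just a ballpark assumption for TCP over fast ethernet, not a measurement from my cluster):

#include <stdio.h>

int main(void)
{
    double latency = 100e-6;                 /* assumed TCP latency, seconds  */
    double bandwidth = 12.5e6;               /* 100 Mbit/s = 12.5 MB/s        */
    double sizes[] = { 1e3, 1e5, 1e6, 1e8 }; /* message sizes in bytes        */
    int i;

    for (i = 0; i < 4; i++)
        printf("%9.0f bytes: %10.3f ms\n",
               sizes[i], (latency + sizes[i] / bandwidth) * 1e3);
    return 0;
}

If your jobs mostly exchange small or infrequent messages, the latency term dominates and gigabit buys you much less than the raw bandwidth number suggests; it only really pays off when you're pushing big messages often.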
Check this page, for some more on the LosLobos!
http://www.unm.edu/~paaffair/news/news%20releases/Mar21hpcc.html [unm.edu]
Re:Uhm... I don't think so... (Score:1)
Check the UNM press release at
http://www.unm.edu/~paaffair/news/news%20releases/Mar21hpcc.html [unm.edu]
Re:Uhm... I don't think so... (Score:1)
Re:Purpose...? (Score:1)
Jeff
Re:Cost/performance (Score:1)
If they're not using dual procs, then I find it hard to believe that it's more powerful than this. [lwn.net]
Quote from site: The FSL cluster (called "Jet") currently consists of 276 nodes, organized into three long banks. The nodes are unmodified, off-the-shelf Compaq Alpha systems with 667 MHz processors and 512 MB of memory.
Re:Woops (Score:1)
Re:I wonder... (Score:1)
Re:maybe just maybe - how do they keep them up!? (Score:1)
Re:Damn. Way too slow (Score:1)
Re:Only 24th (Score:1)
24th doesn't sound bad to me, it's a start anyway.
From the article, they say that clusters max out at 64 machines, limiting their size - but it's also claimed that the cluster acts like a single machine, so my question is: why can't you cluster the clusters to use 4096 machines? Is it simply a case of (lack of) bandwidth linking the machines together?
Re:maybe just maybe (Score:1)
Re:Purpose...? (Score:1)
Purpose...? (Score:1)
Is there a list of these computers to see what's ranked 24?
Re:Cost/performance (Score:2)
Also, the cost of the boxes is almost unimportant with something like this; you have to take into account the actual construction of the network (usually the most time-consuming part of any supercomputer), and the main cost is the support contract. You have to have people available to fix this guy at a moment's notice whenever it breaks.
SMP on the nodes? (Score:2)
Re:What kind of network speed do you need? (Score:2)
Your Working Boy,
Cost/performance (Score:2)
Also, it mentions the limitations of networking; can't you link together 3com (now defunct, I know) switches in a stack to make larger switches? If not, I'm pretty sure that we'll have larger switches in the next few years.
--
Re:Cost/performance (Score:2)
and just to be picky 64x10k=640k or over half a million!
Tom
Re:(OT)How does a first post get marked as redunda (Score:2)
That's it, the moderators are on drugs (Score:2)
I literally read it 3 times through to find the 'flamebait' there: nada. The only moderation down, which could have had a sliver of merit would have been 'overrated' but this is ridiculous. Hopefully this gets caught in meta-moderation...
Chris
Re:not Beowulf? (Score:2)
Beowulf != any of these. Beowulf is the idea that one can take commodity, off the shelf (COTS) components and build a powerful machine at a price far less than a comparable commercial offering.
Codes run on Beowulf, and really on any parallel machine, typically use MPI, PVM, or custom message-passing libraries. The Beowulf idea includes the use of MPI & PVM, among other freely available software packages. Codes that run on shared-memory machines typically use the shared-memory device of MPI, shared memory directly, or pthreads.
For CPU intensive tasks the Beowulf idea is great. Codes that perform lots of disk I/O suffer, as adding higher performance (e.g. SCSI) disks increases system cost greatly. Communication intensive tasks perform the worst on beowulf-style clusters compared to commercial computers, as the interconnect on beowulf-style clusters can't compare. For a relatively large increase in cost, one can use Myrinet [myri.com]. With Myrinet, bandwidth and latency begin to approach that of the switch found on the IBM SP series of machines.
With high bandwidth, low latency interconnect technologies that scale well (e.g. Myrinet), one can build a cluster that outperforms a comparable commercial offering at, say, one quarter to one eighth the price. The difference at that point is software. There's really not a lot out there to configure and administer beowulf-style clusters, and commercial implementations of some packages beat the pants off of their freely available counterparts (compilers, for example). Until the software situation changes there is still reason to buy your big iron from IBM, SGI, and Sun.
--Jason
Re:not Beowulf? (Score:2)
As you note, the real power of distributed/parallel computing comes from the message passing libraries, most commonly MPI or PVM. Beowulf per se is almost nothing more than a label for the generic concept of distributed computing on Linux. The same thing can be done with any other reasonably modern networked computer you have lying around, even those running Windows - you can even mix OSes in a cluster, although this introduces new and interesting problems. (There are a serious lot of underutilized cycles sitting out there on the corporate world's desktops if they're not running OpenGL screensavers...)
BTW: If the phenomenal success of Sun's E10000 Starfire has taught us anything at all, it's that where I/O is important, a big honkin' SMP box kicks cluster butt! Seriously, the interconnect technology between boxes just *can't* be fast enough to compete effectively with a huge multi-level crossbar packet switch like the ones in the E10K. Sun and the other SMP vendors can win here because they own the domain in which the simpler problem resides...
Don't assume by this that I'm against Beowulf clusters at all - they are a great and amazing thing, but there's more than one way to skin a cat, and Beowulf isn't the only path to Linux distributed computing.
Re:not Beowulf? (Score:2)
There are far more serious, industrial-strength solutions out there, things like MPI, PVM, LINDA, and IBM's own HACMP. (Note these cover a lot of ground and are not necessarily even comparable to one another.)
Beowulf (or any of the others listed above) is not automatically the correct distributed computing methodology. Selecting the proper solution for the job at hand is far more complex than you might imagine. There is a lot more developer activity on some of these than there is on Beowulf - MPI in particular is maturing rapidly and is used for solving big/tough problems in many of the largest companies in the world. (No particular MPI advocacy or bias, it just seems like I run into it more often than the others...)
Re:That's it, the moderators are on drugs (Score:2)
Re:IO (Score:2)
Really though, using a solid state drive as a cache for a disk subsystem is an easy way to enhance performance, and is already being sold. You perform a write - instant gratification - and with proper caching algorithms, you can get the same thing for reads. A multi-gig SS drive can easily max out a bus. Multi-level caching is a necessity as speeds increase in systems.
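A toy sketch of the read-through idea (the block size, cache size, and direct-mapped layout are all invented for illustration - this isn't any shipping product's algorithm): check the fast cache first, fall back to the slow disk, then remember the block so the next read is fast.

#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE   4096
#define CACHE_BLOCKS 256

static long cache_tag[CACHE_BLOCKS];               /* which disk block lives in each slot */
static char cache_data[CACHE_BLOCKS][BLOCK_SIZE];  /* stand-in for the solid state device */

static void slow_disk_read(long block, char *buf)  /* stand-in for the disk array */
{
    memset(buf, (int)(block & 0xff), BLOCK_SIZE);
}

static void cached_read(long block, char *buf)
{
    long slot = block % CACHE_BLOCKS;               /* direct-mapped placement */

    if (cache_tag[slot] == block) {                 /* hit: serve from the cache */
        memcpy(buf, cache_data[slot], BLOCK_SIZE);
        return;
    }
    slow_disk_read(block, buf);                     /* miss: go to the slow disk... */
    memcpy(cache_data[slot], buf, BLOCK_SIZE);      /* ...and keep a copy for next time */
    cache_tag[slot] = block;
}

int main(void)
{
    char buf[BLOCK_SIZE];
    memset(cache_tag, 0xff, sizeof cache_tag);      /* mark every slot empty (-1) */

    cached_read(1234, buf);                         /* first read: miss, goes to disk */
    cached_read(1234, buf);                         /* second read: served from cache */
    printf("done\n");
    return 0;
}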
In this sort of system, the interconnect fabric (as fast as it is) can still be a little bit of a bottleneck, too... A good cached RAID disk system on the one end can really keep things smoking, though.
Re:With only 64? (Score:2)
http://www.unm.edu/~paaffair/news/news%20releases/Mar21hpcc.html [unm.edu]
Re:IBM has done it again (Score:2)
Of course, it's a little more technical than just "a bunch of computers hooked up via high speed links (i.e. fast/gigabit ethernet) to provide a parallel solution for complex calculations", but there is a lot of info...
Re:RC5? (Score:2)
Re:Uhm... I don't think so... (Score:2)
http://www.unm.edu/~paaffair/news/news%20releas
"The National Computational Science Alliance (Alliance) will take delivery of a 512-processor Linux supercluster within the next month - a move that will give this nationwide partnership the largest open production Linux supercluster aimed at the research community. The new supercluster, called LosLobos, will be located at the University of New Mexico's (UNM) Albuquerque High Performance Computing Center (AHPCC), one of the Alliance Partners for Advanced Computational Resources sites."
Re:Uhm... I don't think so... (Score:2)
Re:Purpose...? (Score:2)
Finite elements are mainly used for objects/mechanisms, particularly structural analysis (incl. car simulation) and manufacturing (metal stamping).
engineers never lie; we just approximate the truth.
Re:Only 24th (Score:2)
On the other hand, there are algorithmic techniques for masking (network) latency (e.g. time-skewing), so it's possible to make better use of 'loosely-coupled' (relative to the algorithm's interconnect requirements) compute elements (machines/clusters/etc) than you may think.
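The simplest version of the idea is just overlapping communication with computation via nonblocking sends and receives. A bare-bones sketch (array names and sizes invented for illustration):

#include <mpi.h>

#define N 10000

int main(int argc, char **argv)
{
    double halo_out[N], halo_in[N], interior[N];
    int rank, size, left, right, i;
    MPI_Request reqs[2];
    MPI_Status stats[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    left  = (rank + size - 1) % size;              /* neighbours on a ring */
    right = (rank + 1) % size;

    for (i = 0; i < N; i++) { halo_out[i] = rank; interior[i] = 0.0; }

    /* kick off the boundary exchange... */
    MPI_Irecv(halo_in,  N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(halo_out, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ...then do work that doesn't need the incoming data while the
       message is in flight on the wire */
    for (i = 0; i < N; i++)
        interior[i] = 2.0 * interior[i] + 1.0;

    /* only block once there's nothing left to overlap */
    MPI_Waitall(2, reqs, stats);

    MPI_Finalize();
    return 0;
}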
-_Quinn
You can do it too! (Score:2)
All you slashdotters with three or more systems in your basement! Go! Get them networked! Load MPICH or PVM! This should be your mantra:
"If SuperID can do it, then I certainly can"
Win2k clustering only supports 4 boxes max (Score:2)
Re:maybe just maybe (Score:2)
Were the Linux cluster users using gcc/g77? It is well known that (at least for most scientific codes) you can get 50-100% speedup by switching from the GNU compilers to commercial ones from Portland or elsewhere.
If there is still a difference, then the next thing to try is the latest dev kernels, which have better SMP (if SMP nodes are used), and significantly faster disk io through the elimination of double caching.
Since most scientific apps should spend most of their time eating user CPU cycles, I wouldn't expect there to be very much difference between one OS and another; however, node uptime and more established remote admin are points in Linux's favour for big clusters.
Probably Beowulf (Score:2)
MS FUD (Score:2)
Re:Uhm... I don't think so... (Score:2)
Just one man's guess...
Re:Uhm... I don't think so... (Score:2)
In fact, they're using very standard 100 Mbps ethernet, while my impression is that the IBM supercomputer will be much more tightly coupled. The GP cluster is rated at 0.37 THz, about the same as the IBM machine.
On a side note, genetic programming would be ideal for distributed.net; it's disappointing that GP isn't being attempted there.
Uhm... I don't think so... (Score:2)
metrics (Score:2)
Re:RC5? (Score:2)
Honey Nut Beowulf Clusters (Score:3)
Squirrels.
That's right, squirrels.
It has nothing to do with software conflicts, or processors overstepping each other. That can all be taken care of with a little bit of clever coding and hacks/workarounds. But that doesn't take care of the squirrel problem. Every time I finally work around all the network tomfoolery and get the power for umpteen boxes manageable, like clockwork the squirrels come.
And they come not in single spies, they come in battalions.
It's never the same. Sometimes they just chew the wires. Sometimes they try and make off with a box or two. What the hell does a rodent need with a computer?!?!? Wait, I don't want to know the answer to that. My repeated attempts at hunting down and exterminating the wascally bastards are met with comic hijinks and failure. And as far as I'm aware, there aren't any open-sourced squirrel repellent systems. I can't trust a proprietary system not to conflict with the many many tweaks I've made to the system. But alas, I'm stuck with my ACME catalogue and a variety of clever devices which only fail and fail again, each one making me look successively worse.
So let's hope IBM can manage a good rodent-security system, and release it back into the community. God knows I've tried. I'm sure they will realize the importance of this issue after the first few attacks. This is a much overlooked problem, but we need a solution. And as soon as possible.
thankyoutheend.
Re:not Beowulf? (Score:3)
yes there is.. (Score:3)
Re:maybe just maybe (Score:3)
Deploying an NT cluster was certainly a challenge in some ways that would have been easier with Unix, but not impossible. Some of our collaborators have published results favorably comparing the performance of the NT supercluster to that of Linux clusters, so there seem to be good reasons to continue building at least some technology like this on NT.
Imagine (Score:3)
Sorry couldn't resist.
not Beowulf? (Score:4)
just "special clustering software"
Re:(OT)How does a first post get marked as redunda (Score:4)
Damn. Way too slow (Score:4)
I like the comment that it's "only" 24th. As though being only the 24th richest person alive, or only having the world's 24th fastest car, would be something to be sneezed at.
Anyway, is there a list of world supercomputer rankings?