IBM Demos Cray-Matching Linux Cluster

An anonymous reader sent us a link to an InfoWorld story where you can read about IBM slapping together an Open Source Supercomputer capable of matching a Cray on POV-Ray benchmarks. It's basically just a cluster of Xeon-based Netfinitys. Smooth.
  • Certainly isn't any worse than whatever genius at Microsoft decided to name their embedded OS "WinCE". I mean, yeah, I wince whenever I think of the thing, but...

    And I thought marketing was supposed to be Microsoft's -strong- point...
  • If any of them were any good, IBM (and others) would invest in them.

    Reconsider that; what you say isn't logical. You're saying that the fact that IBM and others only invest in Red Hat proves that the other distros don't have merit, but I could line up thousands of Slashdotters who'd argue that Debian or SuSE or whatever kicks Red Hat in the arse. I'm a Red Hat user, mostly because that's what I have used in the past, and it fits how I need the OS to install and function. I'd imagine that IBM and others are choosing Red Hat to be their Linux poster child because it's a smart marketing move. From where I stand, Red Hat is the frontrunner in the corporate world, and companies will just run with that because of Red Hat's established name.

    The other distros have qualities that make them better for some people than Red Hat's... IBM picking Red Hat is purely a marketing move and says very little about the quality of the other distros compared to RH.

  • Posted by Olaimi:


    What if IBM had this package-in-a-box scheme along with
    - UDB (AKA DB2)
    - VisualAge suite
    - Lotus Notes (Domino)
    - e-commerce
    - well, the list is very long, I guess!

    I bet Microsoft has no future in corporate IT departments!

    Cheers ..
  • Ok, so how fast is this? Does the benchmark measure how long it takes to render a "typical" frame? If so, does that mean it would take (3 seconds/frame * 24 frames/second * 17 nodes) 1224 nodes to render a movie in real-time?


    Something like that could make a really cool video game. Of course, in ten years your Playstation will be able to do it.
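
    A back-of-the-envelope version of that math, as a Python sketch (the 3 s/frame, 24 fps, and 17-node figures come from the question above; perfectly linear scaling is assumed, which is optimistic):

        # Nodes needed for real-time ray-traced video, assuming ~3 s/frame
        # on the 17-node cluster and perfectly linear scaling.
        seconds_per_frame = 3.0    # per-frame time on the demo cluster
        frames_per_second = 24.0   # film frame rate
        nodes_in_demo = 17

        node_seconds_per_frame = seconds_per_frame * nodes_in_demo   # 51
        nodes_needed = node_seconds_per_frame * frames_per_second    # 1224
        print(f"~{nodes_needed:.0f} nodes for real-time rendering")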


  • by Smack ( 977 )
    IBM has been around long enough that I think they've run out of acronyms! For example, WAS already stands for "warehouse administration system" and "work activity system". And that means it's one of the least-used sets of initials out there!
  • I especially liked the part when they said they got Linux from a bookstore the day before. Heh.

    "Hey, Bob! We got this nifty IBM cluster here. What'd you want to do with it?"

    "Wait a minute. I'll run out and grab a Linux cd from the bookstore down the street."

    Hahahahhahaha. That's fucking great!
  • by ptomblin ( 1378 )
    But look carefully at that entry for the Dual Pentium IIs, and you'll see that the total cost of the system is listed at $12,000. Either that is one KICK ASS system, or more likely it's a Beowulf cluster of dual P-IIs, and they forgot to mention how many CPUs were involved.
  • That was only a 64-processor air-cooled T3E, the smallest one SGI makes. SGI makes Origin 2000s twice as big (128-CPU SMP) and T3Es *32* times as big (2048-CPU MPP). Also, I'd be more impressed if IBM had pointed to a more widely used (and less temperamental) benchmark than parallel POV-Ray, which ends up being mainly I/O bound. NAS Parallel Benchmark numbers would be nice.

    Beowulf clusters are nice if you've got a parallel problem that only scales well out to a moderate number of processors (32-64 at most), or if price/performance matters more than raw performance. They get clobbered if the problem is bounded by I/O, communication latency, or per-CPU memory bandwidth.

    Frankly, IBM has a lot more to worry about from Beowulf clusters than SGI does. Their supercomputer-class machine, the SP, is just a cluster of rackmount RS/6000s with a very high speed internal network, and it has all the same problems as a Beowulf cluster relative to a more tightly coupled parallel system like a T3E or an Origin. Plus, AIX is eeeeeeevil; IRIX is much nicer IMHO.

    And before anybody asks, yes, I work with both traditional supercomputers (Cray T94, Cray T3E/600-136LC, SGI Origin 2000/24xR10-250, IBM SP-2/8) *and* a Beowulf cluster. We've been doing benchmarks to compare our Beowulf to our big machines; in some cases, the Beowulf wins, and in others, the big machines win. It really depends on the problem. We (i.e. my group at OSC) may be announcing some benchmark pages here in a few weeks.

    --Troy
  • Before you keep posting messages about how unfair the comparison was, read the article again: the point they want to convey is not how fast the Linux cluster was compared to the Cray, but how easy and inexpensive it was to set up and get running, using just a few x86 boxes (I admit that Netfinitys aren't exactly what I think of when I hear the word "cheap") and software that can be acquired for free. Damn, they got the software from Barnes & Noble! How much easier than that can it get?
  • Hell, no, not SCSI for internode communication. Not even Ultra2. Yes, it's considerably faster than Ethernet in data transfer rates, but (a) you can't put a switch on a SCSI bus, so one node has to wait for another to finish sending before it can start transmitting; (b) a device on a SCSI bus can't arbitrarily send data to another device in the chain; (c) you're limited to 16 devices on the bus (I'm not sure whether recent developments change that limitation, though). It would take a fugly hack to make it suitable for the job. Fibre Channel and FireWire could be better choices, but I don't know jack about them, so no further comments.
  • Not yet for Linux, but wait for SGI to come up with some ccNUMA stuff in the not-so-distant future... or so it seems to be.
  • Is the comparison to a Cray fair when you consider the inter-node communication/bandwidth needs?

    In clustering and parallel computation, bandwidth counts. My guess is that with a different application, one requiring much more communication between nodes, the T3E would step on the Netfinitys. 100-megabit Ethernet does it for low-communication jobs, but what about those that require much more intensive inter-node communication?
  • ---Check the pricing for Alpha vs. PII. Crap or not, the Intel chips are cheaper. It's just a matter of how the cards are configured.---

    Oooh graphics cards are expensive aren't they?

    You have not covered the performance aspect. Alpha systems have twice the FPU power of any Intel system at the same price, bought new (per MHz it's a little different, but cost-performance is more important than CPI). For clustering, that is very good. Remember that all new Alphas currently have 64-bit PCI slots, such as those used for gigabit or four-port duplex 100bTX cards, reducing the memory-system bottleneck and increasing raw comm throughput for parallel cluster/node computing like this. Communication is the key for parallel.

    JRDM
  • The comparison you make is bad... The Cray test was done about 15 months ago AND used older software (POV-Ray 2.2 vs 3.02), older compilers, and an older generation of CPU, and no one uses 450MHz Alpha CPUs anymore. Cost-wise, a new DS20 dual Alpha computer is less expensive than a new quad Xeon-450 AND outperforms it.

    If Microway makes a new cluster, its memory performance/bandwidth will probably multiply by 10 given the new chipset.

  • Deep Blue runs on PowerPC processors. There's a port of Linux for PowerPC processors.

    The cluster may be able to outrun Deep Blue... but if you install Linux on Deep Blue, that should speed it up quite a bit, I believe.

    Then again, I seem to recall something about Linux being not quite THAT scalable, although I could be wrong..

    Can Linux handle thousands of processors?
  • Try http://www.starbridgesystems.com/Pages/technology.html

    It's actually information on another supercomputer, but it compares the system to the IBM Blue Pacific.

    Under the processors section, it says that the IBM system uses "5,856 Power PC 604 processors"
  • by tgd ( 2822 )
    I think what's interesting about this is what it didn't say -- that the old record of 9 seconds was on a Dual Pentium II. So having *18 times* the number of processors only got it three times the performance...

    Any comments on this? Obviously a dual Pentium II is pretty damn good at this too, being only 1/3 the speed of a $5.5 million supercomputer. Anyone have any idea why adding so many processors to the Linux cluster would improve results so little?
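
    For what it's worth, the arithmetic behind that observation, as a quick sketch (the 9 s and 3 s times and the CPU counts are from the posts above; the rest is just division):

        # Speedup and parallel efficiency implied by the two entries:
        # a dual Pentium II at 9 s vs. the 36-CPU cluster at ~3 s.
        t_dual, t_cluster = 9.0, 3.0          # seconds per render
        cpus_dual, cpus_cluster = 2, 36

        speedup = t_dual / t_cluster          # 3.0x faster
        cpu_ratio = cpus_cluster / cpus_dual  # 18x the CPUs
        efficiency = speedup / cpu_ratio      # ~17%
        print(f"{speedup:.0f}x speedup from {cpu_ratio:.0f}x the CPUs "
              f"({efficiency:.0%} relative efficiency)")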
  • That's a point, but I doubt that's the cause in this case; I suspect it's something else. If I were rendering images (as opposed to large sets of calculations that depend on the results of other calculations), I'd just split the image up into 36 chunks, have one processor blast through each chunk, and stick them all back together at the end. My assumption is that's how this test works, because they mentioned that a few scanlines were dropped when one node went offline. So it's not true PVM-style clustering like Beowulf. (A minimal sketch of this kind of static split follows at the end of this comment.)

    In this case, 18 times the processors should give 18 times the speed -- unless the test really isn't processor bound. I'm not sure what it would be bound by, however... lousy implementation? I/O? Network?

    If the test isn't really processor bound then the comparison to the cray is meaningless, because there's something wrong with the way the software is coded to work on parallel machines, I'd think.

    I disagree that the $12k cost means it was a cluster of Pentium IIs. I've bought a couple of Pentium II systems in that range; it's easy to get up there when you add a lot of RAM, a lot of hard drive space, etc. Using name-brand parts jacks the price up a lot (i.e., VA selling systems with Intel boards rather than SuperMicro or some other lower-cost company...).

    On a side note, I remember reading a year or two ago that someone was working on a networking layer that allowed IP and other protocols to be routed between cluster machines over a 40MB/sec SCSI bus. Anyone know if that ever got to completion? A four-fold jump in network speed would make quite a difference to I/O bound applications. (And SCSI cards are a lot cheaper than Gigabit ethernet or other real high-speed networking technologies...)
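
    As promised above, a minimal sketch of that kind of static split (an illustration of the idea only -- this is not actual PVMPOV code, and the image size is made up):

        # Carve an image into one horizontal band of scanlines per CPU;
        # each band renders independently and the results are stitched
        # back together at the end. If one node dies, only its band's
        # scanlines go missing -- consistent with the demo's behavior.
        def bands(height, cpus):
            """Yield (start_row, end_row) for each CPU's band."""
            base, extra = divmod(height, cpus)
            start = 0
            for i in range(cpus):
                rows = base + (1 if i < extra else 0)
                yield start, start + rows
                start += rows

        chunks = list(bands(480, 36))   # e.g. 480 scanlines over 36 CPUs
        assert chunks[0] == (0, 14) and chunks[-1][1] == 480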

  • I doubt it...

    SGI's NUMA architecture means data can be pumped *much* more quickly between nodes, 100-1000 times as fast. Network-based Linux clustering is useful only for calculations that are fairly self-contained and don't need a lot of data to process.

    What I think would be more interesting, given SGI's leanings towards supporting Linux on their MIPS and Intel platforms, is if they eventually tweak the multiprocessing in the kernel to support NUMA-style multiprocessing, so I can throw Linux on an Origin server. Or maybe better yet, a NUMA-architecture Intel machine (I'm not really up on floating point speed comparisons between newer MIPS and newer Intel chips). Since they've dropped real PC compatibility on their new Intel machines, that sort of a shift is a lot less painful than the initial dropping of support for DOS/16-bit apps.

    So SGI doesn't get hurt by Linux. Linux *can't* really compete with a Cray at any real-world tasks (not yet...). And SGI is in a *real* good spot to be the ones selling the Linux-compatible hardware that actually could. In which case, why would they care? Their profit may be lower on a $500k Cray-comparable NUMA Linux system than on the Cray, but I'd bet they'd sell enough more of them to make up the difference.

    Time will tell.
  • Ah, that clears it up. It's always good to hear the real deal from the source. :)
  • ...which is equivalent to loading a service pack in the Windows world. How many people run stock NT 4.0? Even clueless users know better: they make sure to keep up with service packs. Using RH 5.2 plus updates is as close to "off the shelf" as a standard production line Windows system.

    --Lenny
  • Clusters are definitely a better idea for something as blatantly parallel as non-real-time rendering. Commodity parts are vastly more cost-effective. However, this is a very special case. Most applications require far more bandwidth for heavy interprocess communication. Try such an application on a Beowulf-type system and you can watch it fall flat on its face. Suddenly computation is I/O bound, and the Cray really earns its keep.

    Things like Crays are expensive mainly because they have very special, very fast hardware for this purpose. It may be extraneous hardware for something straight ahead like a render farm, but there are many cases where such massive bandwidth is very necessary. Thus, for most applications, replacing a Cray with a Beowulf cluster just isn't an acceptable solution.

    Beowulf clustering has been proven to be a cost-effective non-real-time rendering system, however.
    --Lenny
  • I'm still running 2.0.36, but I believe a 2.2 kernel RPM is available. If their copy of Linux was Red Hat-based, couldn't they have bought it off the shelf, installed it, and then downloaded and installed the kernel RPM? I'd say that if all they have to do is get a single file via free download and type 'rpm -i', that still counts as off the shelf.
  • I was most impressed with the graceful failover. Unplugging one machine and having nothing more dramatic than a slight delay in one portion of the result is the kind of presentation that really makes an impression with "results" type people - you know, the ones who say, "I don't care how it works. Just show me that it does work."

    I'm also pleased with IBM's recent decision to release their WebSphere Application Server on Linux - although the person in marketing who thought up that name should be demoted. The acronym is "IBM WAS." Both passive and past tense. Sheesh!
  • I noticed some of those results too. I am thinking that a governing body isn't involved with this process. Several other VERY strange results permeated the benchmarks. I downloaded the results and will keep em around just for kicks.
  • Wouldn't it be possible to export all of your Linux header files over to, say, a windows box, compile the code with VC++, then link it against the appropriate libraries?

    People did this with BeOS when CodeWarrior/x86 was spitting out terrible machine code. Say what you must about M$, but their x86 codegen kicks serious butt.

    For alphas, how about building the code on DEC Unix with the cool compilers?
  • Shouldn't your 13-year-old ass be in school right now?
    Get a clue.
  • we did it off the shelf. we're a redhat and kernel mirror, so we built our own 2.2.x kernels on top of a 5.2 install. we tweaked, of course, but hey, if you are building a supercomputer or a cluster, you better get set to tweak.

    100bT and a switched hub, DEC Tulips bought on sale, donated hardware, etc... we paid approx $400 for our 8 node cluster.

    i mean, who the heck wants to drop in a 2.2.x kernel rpm? c'mon! sources have been out for a while....

  • Where does one find software capable of clustering? I've looked at MOSIX, which seems like it might work. But I've heard a lot about Beowulf (excuse the ignorance of this next part). Is Beowulf a software package I can download somewhere? What other options are there for Linux, OpenBSD, or FreeBSD? Thanks..
  • by trb ( 8509 )
    It's misleading to say that Deep Blue [ibm.com] runs on RS/6000 processors. The chess engine is all in custom VLSI chess processors, the RS/6000 just acts as a control processor, which isn't particularly interesting as a supercomputer application.
  • The failover is part of PVMPOV, and has nothing to do with the configuration of the systems. It was coded this way because it used to run on a room full of unreliable machines, running at wildly different speeds, that other people were also using; even if a machine didn't die, it might be busy with other things at the time, and I didn't want to wait for renderings to finish when other CPUs were idle... Check out the PVMPOV Home Page [ucalgary.ca] for more info. Yes, 32s was impressive at one time (it used to be at the top of the list).
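
    In other words, a master/worker farm where unfinished chunks simply go back into the queue. A toy single-process sketch of that failover idea (worker failures are simulated with a random draw; this is an illustration, not PVMPOV's actual implementation):

        # PVMPOV-style dynamic work distribution with failover: a master
        # hands out small scanline blocks; if a worker dies or stalls,
        # its block is re-queued for another CPU instead of being lost.
        import random
        from collections import deque

        blocks = deque(range(0, 480, 16))   # scanline blocks to render
        done = set()

        while blocks:
            block = blocks.popleft()
            if random.random() < 0.05:      # simulated dead/busy worker
                blocks.append(block)        # give the block to someone else
                continue
            done.add(block)                 # block rendered successfully

        assert len(done) == 480 // 16       # every block eventually finishes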
  • First, a few posters forgot their reading glasses and failed to notice that there were 17 machines, each of which had 36 PIIs in them. If you could have done that with 17 machines having only ONE PII, THAT would be news.
    Second, the article says they used Xeons. I don't know what prices IBM gets, but the cheapest I could find a Xeon was about $700. At that price, just the Xeons would cost half a million. The $150k price tag on this setup is just unbelievable, unless either (a) I really misunderstood how many processors they have, and/or (b) $150k was just what they had to buy in addition to what they already had lying around.
  • So that would be what? 16 2-headed Xeons and a 4-headed Xeon? 6 4-headed Xeons, 1 2-headed Xeon, and 10 1-headed Xeons? I figured since it was IBM they could come up with practically anything on short notice. Wait... what am I SAYING? Anyhoo, nowhere does it say 36 total Xeons. Also nowhere does it say 36 for each server. I just did the math and balked at 2.11765 processors per server.
  • ts.ts. 36 Xeons overall. Read carefully next time.
    Ain't no 36-headed Xeons around.
  • Overall it depends on the application. GCC is not that bad on Alpha integer. The performance loss is mostly in floating point and math libraries.

    This shouldn't be true for much longer. Compaq released their math library for the Alpha last week (see here [digital.com] for details), and, according to posts to comp.lang.fortran, they will be releasing their Fortran compiler as well (as a commercial product, not for free). This should make Alphas much more appealing for cluster use.

    -jason
  • In vector computing, each node computes a very small part of the big picture, making communication time a very big (or small in this case) bottleneck. Thus these computers need super fast, specialized networking connections.


    Umm, I think you're confused. Vector computers, such as the older Cray machines, use special vector processors that can operate very efficiently on long vectors of data, applying the same operations (hence the name). Things get a little more confusing with later machines which are actually parallel-vector computers (i.e. they had multiple vector processors that worked in parallel).

    It is generally accepted that parallel computers, whether they are "big iron" type machines, such as the T3E, Origin 2000 or SP2, or clusters of workstations and PCs, are the way to go for high-performance computing. Of course, some people would point to the latest vector machines from Japan to contradict this...

    You are right that the true measure of performance is based on applications, and there are applications suited to each of these architectures. We have found that for our problems (large-scale reservoir modelling) clusters of commodity PCs perform quite well in comparison to an SP or T3E, even with 100 Mbps networking, but there certainly are other applications with more fine-grained communication requirements for which even a T3E or O2k is barely sufficient.
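
    A concrete way to see the vector-versus-cluster distinction (illustrative NumPy-flavored Python, obviously not era-accurate Cray code):

        # Vector style vs. cluster style on the same array.
        import numpy as np

        a = np.arange(1_000_000, dtype=np.float64)

        # "Vector" style: one operation streamed across the whole array,
        # the way a Cray vector unit chews through long vectors.
        c = a * 2.0 + 1.0

        # "Cluster" style: split the array into per-node chunks, work
        # independently, and combine partial results at the end. The
        # communication happens at the seams, which is exactly where a
        # slow network hurts fine-grained problems.
        chunks = np.array_split(a, 4)                # pretend 4 nodes
        partial = [chunk.sum() for chunk in chunks]  # independent work
        total = sum(partial)                         # one small reduction

        assert np.isclose(total, a.sum())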
  • And a very expensive one:

    Check www.microway.com [microway.com] for an Alpha cluster priced at $2,500 per node and $4,500 for the master console. This means that for the $150,000 used by IBM one could assemble a 50+ node Alpha cluster instead of 17 PCs... (arithmetic sketched at the end of this comment).

    God, when will people ever learn that x86 just isn't worth it...
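
    For the record, the node-count arithmetic behind the "50+" claim, using the prices quoted above:

        # How many $2,500 Alpha nodes fit in IBM's $150,000 budget after
        # one $4,500 master console? (Prices as quoted above.)
        budget, console, node_cost = 150_000, 4_500, 2_500
        nodes = (budget - console) // node_cost
        print(nodes, "Alpha nodes")   # 58 -> a "50+ node" cluster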
  • Microway has NDP compilers and libraries that are comparable to Compaq's. Unfortunately not all of them are available for Linux. Unfortunately they also cost money.

    Still, I would bet that you can actually get a very decent special deal if you purchase all the stuff together.

    Overall it depends on the application. GCC is not that bad on Alpha integer. The performance loss is mostly in floating point and math libraries.

    Anyway, I will bet on the Alpha for most cases ;-)
  • Sorry, dude. This does not scale. You are going to start running into some real network equipment problems and heavy expenses after 16-24 nodes.

    So getting a bunch of sloppy boxen is not the idea. There has to be a compromise between box speed, box quantity, and the price of network equipment.
  • by TA ( 14109 )
    Er, the IBM SP2 inter node bandwidth is nothing to write home about.. I'm working with these beasts, and the SP switch can do around 30 megabytes/second at maximum. If you reach 25 MB/s in real life then you're lucky. I haven't heard about any big improvements on that speed on the newest models either, in any event they must increase that speed by several orders of magnitude if you want to compare with e.g. Origin 2000. And considering the price of an SP2 rack the price/performance is, eh, interesting..
    TA
  • by TA ( 14109 )
    Oh I have read that page, and I have used all the tuning tricks in the (IBM) books, and the SP switch bandwidth *still* sucks. Other companies we work with on a big project have done a lot of testing as well, and the TCP bandwidth is just bad. As I said, if you get 30MB/sec in real life then you're good (and don't even think about UDP, that's really terrible). Now, we're not using the latest and greatest hardware, the nodes we use are 133 MHz. But we don't see much improvement from the 66 MHz nodes.
    No, I'm not impressed by the SP switch. And besides, it's a terrible beast to work with.
    TA
  • by TA ( 14109 )
    Yeah, we're using slightly old models, as I mentioned in another posting. That's the deal with IBM; however, I didn't think they'd be pushing 1994 models on us! The first rack had 66MHz nodes and the HP switch; the next (which came a couple of months later) had changed to the SP switch (less reliable in our experience, btw); the newest nodes are now 133MHz, which is still a bit behind the specs you can find on the latest and greatest.
    It's interesting that you have measured 100 MB/s on the latest equipment; the application should in theory be running on new hardware when it goes operational. It's very useful to have an idea of how the switch will perform, so thanks a lot for that info.
    TA
  • It's amusing to note that IBM didn't compare the Linux cluster to its own hardware. I'd be curious to know how a 12-way RS/6000 S70/S7A running AIX under HACMP would stand up to the Netfinity/Linux assault.

    InitZero
  • Well, (this is just speculation) I don't think they just used one Dual PII. I'm mainly making this guess based on the cost ($12,000). Perhaps the Dual PII is what one node is, and they have several of these nodes?
  • I use Red Hat myself (well, Mandrake, but it amounts to the same), but what Big Blue have done is exactly what you have just said: they have highlighted the open source community in a way that no one else has the profile to do... The reason they used Red Hat is because it's the distribution that was in the back of the book they bought, not because it's any better or worse than any other distribution, or UNIX derivative. I wouldn't have been surprised to see the same article around FreeBSD, but it was Red Hat and Linux... and mighty top stuff it was too!!
  • Notice that 5 of the top 10 systems were Linux based - from
    http://www.haveland.com/cgi-bin/getpovb.pl?search=Parallel%3A&submit=List+all+Parallel+Results

    D
  • It's amusing to note that IBM didn't compare the Linux cluster to its own hardware. I'd be curious to know how a 12-way RS/6000 S70/S7A running AIX under HACMP would stand up to the Netfinity/Linux assault.

    This demo was done at a Linux show. Linux on RS/6000 is still a work in progress, so it is not surprising that they aren't ready to show that. Doing a demo with AIX at that show would have been a political faux pas.

  • Perhaps Redhat (and Barnes and Noble) has an amazing distribution model, but this chart [haveland.com] says they were running Linux 2.2.2. I don't think a 2.2.2 kernel could have been pulled off of the shelf so soon after becoming available.

    bnf

  • The graceful failover is a red herring to me. It was ascribed to IBM's "X architecture", but it sounded more like an application-level adaptation.
  • by Gumber ( 17306 )
    Bogus:

    Neat but not really. A cluster of PCs connected over fast ethernet is not as flexible as a Cray. On the other hand, a Cray is a waste of money for rendering.
  • but experts out there in /. land, feel free to tell us if & how I'm wrong.

    There has for some time been a rule of thumb that adding a second (or third, or 17th) processor to a problem doesn't get you double the performance, because there is overhead deciding "which processor is going to do what."

    What this means is that at some point, adding parallel processors to a problem ceases to be a cost-effective answer. (This has a name, Amdahl's law; a quick numeric sketch follows at the end of this comment.)

    See my other note on the main thread (this one got submitted first, so be a little patient!!) as to what I think is of greater long-term significance to the Linux world.
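
    The sketch promised above: the rule of thumb is usually formalized as Amdahl's law. If a fraction s of the work is inherently serial (including the "who does what" coordination overhead), the best possible speedup on n processors is 1/(s + (1-s)/n). In Python (the serial fractions below are illustrative, not measured from the demo):

        # Amdahl's law: even a small serial/coordination fraction caps
        # the speedup that more processors can buy.
        def amdahl_speedup(n_procs, serial_fraction):
            return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

        for s in (0.01, 0.05, 0.20):
            for n in (2, 17, 36, 1024):
                print(f"serial={s:.0%}  n={n:4d}  "
                      f"speedup={amdahl_speedup(n, s):6.1f}")
        # With 5% serial work, 36 CPUs yield only ~13x, and even 1024
        # CPUs barely reach ~20x -- more processors stop paying off.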

  • (BTW, this post is entirely IMHO; that's "in my humble opinion", for new /.'ers.)

    This demonstration is all about mind share -- something that Microsoft doesn't want Linux to achieve. Let me explain.

    For many years, Cray built absolutely the highest-performing number-crunching computers in the world -- and every computer scientist knew it. We used to joke about having our own desktop Crays -- if someone would just lend us (in this case) $5.5 million per workstation. So here's the point that IBM was really trying to get across to the corporate IS people out there -- that Linux is competitive with anything Microsoft can produce. Follow the steps:

    1. Take off-the-shelf hardware (in this case, IBM Netfinitys).
    2. Take a common Linux distribution (Red Hat, from the back of a book purchased at Barnes and Noble).
    3. Give a set of presumably talented engineers an EXTREMELY limited period of time (they bought the book ONE day before LinuxWorld) to:
      • install and configure a set of 17 parallel machines,
      • set up the network,
      • install the software,
      • tune the installation...
    [Note: in my book, just setting up the machines to run in parallel in the time they did is awesome enough!] But IBM specifically chose this relatively inexpensive Linux cluster to demonstrate to the world that it could match the performance of a Cray.

    Whether or not it could be done with NT-based machines misses the point.

    Although I'm not always a fan of Big Blue, in this case we should all thank them for a great job in once again proving the power of Linux to the rest of the computing world.

    Take that, Microsoft!!

  • I have not actually built any Xeon cluster (or Celeron cluster, for that matter), but when you run an application like POV-Ray, why the heck would you need such a large L2 cache anyway? The bottleneck is still the FPU and I/O activity such as the network. FPU-wise, a 450MHz Xeon does not outperform an o/c'ed 450MHz Celeron.
  • by Soko ( 17987 )
    It's math - you require double the ponies to get the time in half - and it gets worse as computational times approach zero. For an analogy, in 1980 a TF Dragster could do the 1/4 mile in just under 6 seconds, with ~2000HP. Today, it takes 6000 HP to do the 1/4 in 4.5. It would take 9000 to get under 4 they say - and so it goes.
  • My brilliant coworkers woulda loaded Win98 on it with no hesitation.


    Last year Jim Gray (now at Microsoft Research, bleh) was out here at Stanford giving a talk on what he thought the future of computing was. He seemed to think that clusters were the way to go -- when I asked him about the latency issue, he seemed to be of the opinion that all the interesting computational tasks of the future _were_ "embarrassingly parallel", and that anything that wasn't was pretty much good enough already.

    I'm not sure I agree with that assessment... But I'm just a dumb systems researcher. What do I know about applications? :)
  • Everyone knows that a cluster won't perform well for computations that can't be easily parallelized without massive internodal communication - no one would use a cluster for those types of problems. The point is that it _is_ a fair test for the types of computations that you _would_ use a cluster for. For those types of applications, you're better off spending $150,000 on a PC cluster than millions on a Cray.
  • The whole point of Avalon was cost effectiveness. That's why they passed on rack mount cases.

    Just in case anyone was in a coma all last year and doesn't know what Avalon is, here's [lanl.gov] the link.

    I wonder if the Avalon folks ever tried anything as trivial as Ray Tracing.

  • Ok, someone who actually knows what (s)he's talking about.

    There's nothing special about this news other than the fact that the individual nodes are running linux. Which basically makes this an SP2 minus the superfast network (and the dent in the wallet).

    The measuring stick for all computer hardware issues is the application. There is supercomputing (vector computing, like Cray, traditionally) and there is parallel computing (like any old cluster of workstations). The distinction is the type of operation. In vector computing, each node computes a very small part of the big picture, making communication time a very big (or small in this case) bottleneck. Thus these computers need super fast, specialized networking connections. There are many problems/programs which may be parallelized and yet still have a significant sequential segment, causing the bulk of the processor cycles to be spent on processing, as opposed to waiting for data communication.

    The problems described by the latter are becoming more and more popular. Vector computing, however, is primarily core scientific applications (physics, math, weather prediction, etc.) which have not seen dramatic computational advances in the last decade.

    An SP2 is sort of in the middle of the spectrum, since on top of having high-powered nodes, it has a fast network. A COW running Linux catches the bottom end of this spectrum: its nodes are high-powered but its network is slow. With Ethernet, Fast Ethernet, or ATM it could never match the performance of Crays or Connection Machines. But then, what do you expect for $2000 a node?

    Also
  • I concur.

    (by the way, I bet I could probably piss further than you)
  • Whatever. These are old school Alphas...who would build a new Alpha 450 now?? Besides...not to knock Linux clusters (our ACM chapter just brought one online), but this is kind of a bad comparison...as this kind of stuff doesn't show the HUGE difference in internodal bandwidth between these two systems. If you get something that needs a lot of talking between nodes going, the Cray would pretty much rape the cluster like no tomorrow...the latency on switched fast Ethernet (even Gbit Ethernet) just can't compare to these whack (and horrendously expensive) supercomputer interconnection systems.

    CJK
  • I don't know if that's a fair statement...
    Sure, the architecture of the system may be old school, but that's not why people set up Beowulf clusters...they buy them for the untouchable price/performance for coarsely-grained problems. End of story.
    Don't worry, though...Linux development won't stand still...we'll see changes in the future to allow for more flexible architectures.

    CJK
