
SGI Installing Beowulf
anaZ writes "SGI today announced that it will install the company's first 128-processor Linux(R) cluster at the Ohio Supercomputer Center." It seems SGI will be using 32 Quad Xeon boxes (1400L). I wonder if (and hope!) it is just the first of many.
Cool (Score:1)
Re:Cool (Score:1)
Way to go OSC (Score:1)
I'm glad to see them get some new hardware to play Quake on! Do you think this means they'll give away the Onyx2 machines they have now?
I thought it was only one processor per box. (Score:1)
Couldn't find a price for those 1400L's (Score:1)
But how much are those SGI servers going to cost? The PHBs consider SGI to be a "name" that they would be willing to install, and they assume an SGI would cost the same as a Sun.
Also, why put a groovy case on a machine that's going to sit in a darkened server room?
Keep dreaming! :) (Score:1)
Quad Xeon? (Score:1)
Surely the idea with clusters (and SMP, for that matter) is to use truckloads of cheap stuff to get a better result than one big expensive thing. Presumably 1024 P2-450s (or G3s) would prove to be cheaper than 128 Quad Xeons and a damn sight faster into the deal. Probably cheaper than a T3E too.
Anyone feel like doing the sums? You may assume a BOOTP server obviating the need for 1024 hard drives, if you want.
Dave
Re:Quad Xeon? (Score:1)
Good.. (Score:1)
---
Three Cheers for Ohio.. (Score:1)
Maybe I should go back to school... wait, wtf am I saying?!?
-fester(licious)
Re:Quad Xeon? (Score:1)
I would guess that you get a performance kick by keeping the CPUs close together, where they can talk to RAM and other CPUs over a medium that's an order of magnitude faster than a 100Base network. I don't think you'd get a 4x increase over single-proc boxes, but I'd guess you'd get more bang for the buck, since you can (if coded for it) keep processes near a group of CPUs and not have to send info over the slower network. This is where clusters get interesting: how can I keep my part of the program near the closest RAM and CPUs, so I don't have to go over slow links, and still get the full potential of the box?
tsuiter@midusa.net
Beowulf (Score:2)
Quad Xeon... yum. (Score:1)
128 of those boxes in one place, in one Beowulf cluster... I think I'm going to have an orgasm.
(Offtopic) Cray 1 vs. modern PC (Score:1)
just curious,
mike
Re:Quad Xeon? (Score:1)
Also, the quad Xeons, while not a great deal, are probably not as bad as they look, because each node requires a case, a power supply, a minimum of one network interface (and probably more for a big cluster), a port or more in the Ethernet switch, and a hard drive. The price tag on all of these does add up, even when you're buying cheap hardware, let alone when it comes from SGI.
I think a lot of (most?) Beowulf clusters do have local storage on each node, and for good reason. The network is usually a bottleneck already, and it costs a lot more than the price of a hard drive to upgrade beyond Fast Ethernet (per node, that is).
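For anyone who wants to do the sums, the per-node overhead argument above fits in a few lines of Python. This is only a sketch: the function name is my own, and every price in the example is a made-up placeholder, not a quote from SGI or anyone else.

```python
def cost_per_cpu(cpu_price, cpus_per_node, node_overhead):
    """Cost per CPU once the fixed per-node parts are shared out.

    node_overhead covers everything bought once per node: case, power
    supply, network interface(s), switch port(s), and local disk.
    """
    if cpus_per_node < 1:
        raise ValueError("a node needs at least one CPU")
    return cpu_price + node_overhead / cpus_per_node


# Placeholder prices, chosen only to show the shape of the trade-off.
single = cost_per_cpu(cpu_price=300, cpus_per_node=1, node_overhead=800)
quad = cost_per_cpu(cpu_price=900, cpus_per_node=4, node_overhead=1200)
print(single, quad)
```

With these invented numbers, most of the premium on the quad node's CPUs is offset by sharing the overhead four ways. Plug in real quotes to see where the break-even actually falls.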
Re:I thought it was only one processor per box. (Score:1)
da'fly
Re:Quad Xeon? (missed point?) (Score:1)
1. A piece of code that needs access to >512 megs ram.
2. A piece of code that performs significantly better with >512k of L2 cache.
Although I am not a big fan of saying Xeon=server, there are cases where the Xeon solves a problem MUCH better than the standard PIII. I hope that the appropriate research has been done in this case, as opposed to a salesman walking in and saying "you need..."
Spinning off Cray, still in the supercomputer biz (Score:1)
SGI seems to be betting a large chunk of their business on Linux. Is it a shrewd business move, or done out of desperation? Only the future shall tell. And, btw, at current market capitalization Red Hat has become the second largest UNIX vendor (i.e. primary business based on UNIX). They could buy SGI and SCO at this point and still have some left over.
I bet this is why they spun off Cray (Score:1)
Who am I?
Why am I here?
Where is the chocolate?
Some numbers... (Score:1)
Cray YMP-8: 8 CPUs, 250 MFLOPS theoretical peak per CPU, about 150-200 MFLOPS on real-world code.
Cray T94: 4 CPUs, 1800 MFLOPS theoretical peak per CPU, about 450-900 MFLOPS per CPU on real-world code.
Cray T3E600/LC-136: 136 300MHz DEC Alpha 21164s (8 OS/command + 128 applications), 600 MFLOPS theoretical peak per CPU, about 90-150 MFLOPS per CPU on real-world code.
SGI/Cray Origin 2000: 32 300MHz MIPS R12ks, 600 MFLOPS theoretical peak per CPU, about 160-200 MFLOPS per CPU on real-world code.
OSC Mk.1 Beowulf node: 2 400MHz Pentium IIs, 400 MFLOPS theoretical peak per CPU, about 80-100 MFLOPS per CPU on real-world code.
(Assuming 64-bit floating point throughout; the Intel chips don't suffer as much as you might think from this, as they do all FP internally with 80-bit precision and truncate to 32 or 64 bits.)
If you've ever wondered why people pay big bucks for Cray vector machines, let me sum it up in three words: sustainable memory bandwidth [osc.edu]. The T90 machines can sustain on the order of 13 GB/s memory bandwidth, and the J90/SV1 machines can sustain about 5 GB/s. By comparison, most workstation and PC systems can sustain about 300-500 MB/s on a good day with a tail wind.
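The per-CPU figures above are easier to compare once they are multiplied out. A rough Python sketch follows; it uses only the numbers quoted in this post, counts just the 128 application CPUs on the T3E, assumes the YMP-8 "real-world" figure is per CPU like the others, and treats 1 GB/s as 1000 MB/s. The function names are my own.

```python
# (name, CPUs counted, peak MFLOPS per CPU, (low, high) sustained MFLOPS per CPU)
# Figures are the ones quoted in the post. The T3E entry counts only the
# 128 application CPUs; the YMP-8 sustained range is assumed to be per CPU.
SYSTEMS = [
    ("Cray YMP-8", 8, 250, (150, 200)),
    ("Cray T94", 4, 1800, (450, 900)),
    ("Cray T3E600/LC-136", 128, 600, (90, 150)),
    ("SGI/Cray Origin 2000", 32, 600, (160, 200)),
    ("OSC Mk.1 Beowulf node", 2, 400, (80, 100)),
]


def summarize(cpus, peak, sustained):
    """Return (aggregate peak MFLOPS, (low, high) fraction of peak sustained)."""
    low, high = sustained
    return cpus * peak, (low / peak, high / peak)


def bandwidth_ratio(big_gb_s, small_mb_s):
    """How many times larger big_gb_s is than small_mb_s (1 GB = 1000 MB assumed)."""
    return big_gb_s * 1000 / small_mb_s


for name, cpus, peak, sustained in SYSTEMS:
    total, (lo, hi) = summarize(cpus, peak, sustained)
    print(f"{name}: {total} MFLOPS peak, {lo:.0%}-{hi:.0%} of peak sustained")

# T90 (about 13 GB/s) against the 300-500 MB/s quoted for workstations and PCs.
print(bandwidth_ratio(13, 500), bandwidth_ratio(13, 300))
```

The fraction-of-peak column is the interesting one: it shows how far each machine falls from its theoretical number on real code, which is the memory bandwidth point in a nutshell.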
Uhm HELLO? (knocking sound) McFly! (Score:1)
but with shared memory as well. SGI has been going this way for a while. Now that all the Sun weenies have had to eat their hats about all the shit they've slung about ccNUMA, they too have cluster-style machines in the wings, and are pushing them HARD.
SMP/S^2MP/MPP and our little Beowulfs are getting closer together every day. (Ever notice that Myrinet is 100% like the Cray T3E torus?)
I expect that clusters will continue to grow, and companies like whatever SGI becomes will push multi-CPU clusters on Linux.
da'fly
Design Decisions, System Details (Score:1)
short end of the stick on this deal. We disagree, and here's why:
1. Reliability: The most common hardware failures in cluster systems are hard drives and power supplies. The 1400Ls have redundant power supplies, and smaller numbers of nodes will generally have a lower component failure rate.
2. Migration Path: The "cluster of SMPs" model is used in several of the largest computers currently in operation, including the ASCI Blue Mountain and Blue Pacific machines. Users should be able to develop code on our cluster and then move their code to these much larger platforms with little additional effort.
3. Application Needs: We have several users with applications which need in excess of a GB of RAM and several GBs of temporary disk storage per node. Many of these applications are "legacy codes" from the vector machines which are difficult to parallelize using message passing approaches, but which can be parallelized relatively easily on SMP systems using compiler directives. The architecture we have selected allows this as well as multilevel parallel programming, using message passing between nodes and compiler directives within a node.
4. Flexibility: We have users in virtually all scientific and engineering disciplines, most of whom (50-75%) write their own code. We need a cluster architecture which can accommodate a mix of serial, SMP parallel, and MPP parallel applications.
There are also drawbacks to this approach, primarily related to memory bandwidth and the added cost for the quad processor nodes.
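To make the multilevel idea in point 3 a bit more concrete, here is a toy Python sketch of the two-level split: a loop range is first divided among nodes (the part message passing would handle) and each node's share is then divided among its CPUs (the part compiler directives would handle). It only computes index ranges. It is not MPI or directive code, and the function names and the 1000-item workload are my own inventions.

```python
def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + base + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def two_level_split(n_items, nodes, cpus_per_node):
    """Map each (node, cpu) pair to the half-open index range it would work on."""
    plan = {}
    for node, (n_lo, n_hi) in enumerate(split_range(0, n_items, nodes)):
        for cpu, chunk in enumerate(split_range(n_lo, n_hi, cpus_per_node)):
            plan[(node, cpu)] = chunk
    return plan


# 32 nodes of 4 CPUs, matching the 128-processor cluster.
plan = two_level_split(n_items=1000, nodes=32, cpus_per_node=4)
print(plan[(0, 0)], plan[(31, 3)])
```

The outer split is what would cross the network, so it is the one worth keeping coarse. The inner split stays inside one box's shared memory.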
Here is a slightly more detailed description of the new OSC Beowulf than was in the press release:
32 compute nodes plus a front end node, each with:
    4 Pentium III Xeon 500MHz processors
    2 GB RAM
    18 GB SCSI-UW disk
    1 Fast Ethernet interface
    2 Myrinet interfaces
8 16-port Myrinet switches
Various software:
    SGI's modified Red Hat distribution
    PBS queuing system
    Portland Group and KAI compilers
    AMBER (computational chemistry)
    Gaussian 98 (computational chemistry)
    Cactus (computational physics)
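Multiplying the per-node figures out gives the totals for the compute nodes. A quick Python sketch, using only the numbers listed above and leaving the front end node out of the sums (the names are my own):

```python
# Per-node figures and counts as listed in the post (compute nodes only;
# the front end node is left out of the totals).
COMPUTE_NODES = 32
PER_NODE = {"cpus": 4, "ram_gb": 2, "disk_gb": 18, "myrinet_interfaces": 2}
MYRINET_SWITCHES = 8
PORTS_PER_SWITCH = 16


def cluster_totals(nodes=COMPUTE_NODES, per_node=PER_NODE):
    """Multiply each per-node quantity by the number of nodes."""
    return {name: count * nodes for name, count in per_node.items()}


totals = cluster_totals()
switch_ports = MYRINET_SWITCHES * PORTS_PER_SWITCH
print(totals)
print("Myrinet switch ports:", switch_ports)
```

The CPU total matches the 128 processors in the press release. How the switch ports beyond the node interfaces are used is not stated, so the sketch stops at the raw counts.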
We will be posting further details at http://oscinfo.osc.edu/hardware/ [osc.edu] as things develop.
Sincerely,
Re:Couldn't find a price for those 1400L's (Score:1)
Around $14,000 with Linux.
Re:Beowulf (Score:1)
Oh... never mind...
Well, if you're an OSU student... (Score:1)
Do they need to hire any programmers?
We often hire OSU students as programmers, gofers, experimental test subjects (oops, did I say that out loud? :), etc. It's a great place to work. Stop by at the beginning of fall quarter or watch the OSU "green sheets".
Re:Cool (Score:1)
Re:RAM limit a problem? (Score:1)
Linus has stated that he doesn't want to 'kludge' the kernel to support more memory than 32 bits can address (only on 32-bit CPUs, of course). This creates a limit of 2 gigabytes of addressable memory (not 100% sure here).
Actually, it's theoretically possible to address up to 4 GB. SGI's "bigmem" [sgi.com] patch makes it possible to do this, but I think Linus has rejected the patch for the mainstream kernel. That wouldn't stop SGI from shipping kernels built with this patch, so long as they also distribute the source for it.
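The 2 GB and 4 GB figures in this exchange fall out of simple powers of two. A quick Python check, assuming a flat byte-addressed space and binary gigabytes (both my assumptions). The 31-bit case is only there to show what a 2 GB limit corresponds to, not to explain where the kernel's limit comes from.

```python
GIB = 2 ** 30  # bytes in one binary gigabyte


def addressable_gib(address_bits):
    """Size of a flat address space with the given pointer width, in GiB."""
    return 2 ** address_bits / GIB


for bits in (31, 32, 64):
    print(bits, "bits ->", addressable_gib(bits), "GiB")
```

The 64-bit line is the point behind the Alpha question further down: the address space stops being the constraint long before the hardware does.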
Will this limit any of your applications?
Not really. Our J90/SV1 and Origin systems both have 16GB of memory, but I don't think we allow a single job to use more than 2-4 GB of memory. Large files are actually more of a problem than large memory; Gaussian can generate 20+ GB output files. Thankfully, support for large files on 32-bit platforms seems to be coming along.
What was your reasoning behind not choosing a similar solution from an Alpha vendor? (64-bit CPU, much more addresses)
We have a fair amount of Alpha experience in-house already; several years ago we had a classroom cluster of DEC Alpha workstations, and we currently have a Cray T3E which is also Alpha-based. Our main concern with the Alpha was software availability, especially compilers. The Compaq Digital Fortran beta is a good start, but I would like to see the Portland Group and KAI compilers for Alpha Linux as well.
Re:Couldn't find a price for those 1400L's (Score:1)
Also, why put a groovy case on a machine that's going to sit in a darkened server room?
Our supercomputers sit in a glasshouse in full view. You have to give the public something to look at or the $$$ won't come rolling in anymore. :) I'm really glad our servers aren't ugly gray or beige boxes! You want a very fast computer to look the part.
Cheap != Inexpensive was Re:Quad Xeon? (Score:1)
Just because a car is cheap doesn't mean it is not going to cost you in terms of repairs and shoddy engineering down the road. What people forget is that price is a function of economies of scale and functionality. Would you base a car purchase purely on the number of cylinders and torque? Similarly, computer purchasing experts look at the overall system: the balance between components, cost of parts, availability of drivers and software, cost of ownership, and the human learning curve. Personally, I think too many people without a clue are hoodwinked by fast-talking salesdroids and bells and whistles. If you look at the hard evidence, you might even find that MIPS chips (in the Origins) give better sustained real-world application performance for a certain class of problems than even the highly touted Alphas. Also, if you look at, say, SGI's O2, you find that it is designed for easy rack-mount maintenance. All this comes at a premium.
Sheesh
LL
Right in my backyard! (Score:1)
Re:I bet this is why they spun off Cray (Score:1)
-- Reverend Vryl
Re:Cool (Score:1)
Re:I bet this is why they spun off Cray (Score:1)
You know, I wish you folks would come back to reality with this Beowulf crap. It may be a cheap way to weld a bunch of little bargain basement peecees together, but no Beowulf abomination is going to do a specific Cray job like a Cray. I am assuming none of you folks making these claims have any experience whatsoever in the science and engineering disciplines (data mining, visualization, financial forecasting, aerospace) that Cray machines are so well suited for. Cray's market share may be shrinking at the moment, but that is only due to management geniuses and bandwagon jumpers who think this whole Linux thing is going to save their company somehow.
Linux is nowhere near the stability, scalability, and performance of a polished proprietary Unix like UNICOS. I cringe when I see you kids want to install your precious Linux on a T3E so it will be "really fast". A T3E is running the best possible operating system for its architecture, as is the T90 and SV1 (know what those are?). No Linux flavor is going to do these professional machines justice, let alone make them BETTER.
Keep it in the basement of 14-25 year old translucent skinned pseudohacker types living at home, because BitchX and Gimp are the only apps those kids are going to run. Oh, and have mom make me an extra grilled cheese sandwich.
-Patrick Krekelberg
Institute of Electrical and Electronics Engineers