SGI Installing Beowulf

anaZ writes "SGI today announced that it will install the company's first 128-processor Linux(R) cluster at the Ohio Supercomputer Center." It seems SGI will be using 32 quad-Xeon boxes (1400Ls). I wonder (and hope!) whether it is just the first of many.
  • Cool, very cool...
  • Extremely cool.
  • As a former employee of a company with offices at the Supercomputer Center (the Greater Columbus FreeNet), all I can say is: way to go!

    I'm glad to see them get some new hardware to play quake on! Do you think this means they'll give away the Onyx2 machines they have now?
  • I don't know, but when I think of a Beowulf cluster, I think of separate machines. I guess 32 quad-processor boxes are legal :) "... Yeah, and Microsloth is going to own the world... oops."
  • V. Cool.

    But how much are those SGI servers going to cost? The PHBs consider SGI a "name" they would be willing to install, and they assume an SGI would cost the same as a Sun.

    Also, why put a groovy case on a machine that's going to sit in a darkened server room?

  • The new OSC Beowulf is going to be used as a compute engine -- I'm not even sure if the 1400Ls have graphics cards! I don't see our SGI MIPS graphics hardware going anywhere soon. :)

    --Troy Baer, Systems Engineer, OSC Science & Technology Support
  • I feel like one of us is vaguely missing the point...

    Surely the idea with clusters (and SMP, for that matter) is to use truckloads of cheap stuff to get a better result than one big expensive thing. Presumably 1024 P2-450s (or G3s) would prove cheaper than 128 quad Xeons and a damn sight faster into the bargain. Probably cheaper than a T3E too.

    Anyone feel like doing the sums? You may assume a BOOTP server obviating the need for 1024 hard drives, if you want.
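The sums sketch out something like this. Every dollar figure below is a hypothetical placeholder for illustration; the only number taken from this thread is the roughly $14,000 quad-Xeon price another poster quotes further down.

```python
# Back-of-the-envelope cost model for "many cheap boxes vs. few fat ones".
# All prices are hypothetical placeholders, except the ~$14,000 quad-Xeon
# figure quoted elsewhere in this thread.

def cluster_cost(nodes, cost_per_node, cost_per_nic=50, cost_per_switch_port=100):
    """Boxes plus the per-node networking overhead each box drags in."""
    return nodes * (cost_per_node + cost_per_nic + cost_per_switch_port)

# 1024 diskless P2-450s (booting via BOOTP, as suggested) vs. 32 quad Xeons.
many_cheap = cluster_cost(nodes=1024, cost_per_node=1500)
few_fat = cluster_cost(nodes=32, cost_per_node=14000)

print(f"1024 uniprocessor nodes: ${many_cheap:,}")
print(f"  32 quad-Xeon nodes:    ${few_fat:,}")
```

Note the comparison isn't apples-to-apples (1024 CPUs versus 128); the point is that the per-node overheads (NICs, switch ports, cases, drives) scale with the number of boxes, not the number of CPUs.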


  • That was the point of my post. I don't think I articulated it well, but I was trying to say that those expensive quad boxes are not what I had in mind for a Beowulf. I expected a whole bunch of PII machines... true, more bang per server, but more buck, too.

  • by pen ( 7191 )
    Good, now I can get those extra fps in Quake to set me apart from the lamers.

  • I've been meaning to stroll across the street and genuflect before the Cray sometime.. looks like I may have an even better reason to drop by there.

    Maybe I should go back to school... wait, wtf am I saying?!?

  • by Anonymous Coward
    Wouldn't the slowest parts of the box be I/O, which we'll throw to the side for now, and then the 100 Mbit intercommunication links?

    I would guess that you are giving it a performance kick by keeping the CPUs close together, where they can talk to RAM and other CPUs on a medium that's an order of magnitude faster than a 100Base-T network. I'm not thinking you'd get a 4x increase over single-proc boxes, but I'd guess you'd get more bang for the buck, since you can (if coded for it) keep processes near a group of CPUs and not have to send info over the slower network. This is where clusters get interesting: how can I keep my part of the program near the closest RAM and CPUs, so I don't have to go over slow links, and still get the full potential of the box?
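A toy model of that tradeoff, with all numbers below being illustrative guesses rather than measurements: assume traffic between CPUs in the same box goes over memory, and traffic between boxes goes over Fast Ethernet.

```python
# Toy model: time per step for N CPUs when each CPU exchanges `comm_mb`
# of data with partners.  Same-box traffic goes over memory (fast);
# cross-box traffic goes over Fast Ethernet (slow).  All constants are
# rough guesses for late-90s hardware, not benchmarks.

FAST_ETHERNET_MBPS = 100 / 8      # ~12.5 MB/s on the wire
MEMORY_MBPS = 400.0               # order-of-magnitude guess for RAM copies

def step_time(compute_s, comm_mb, cpus_per_box, total_cpus):
    # Rough fraction of a CPU's partners living in the same box.
    local = (cpus_per_box - 1) / max(total_cpus - 1, 1)
    comm_s = comm_mb * (local / MEMORY_MBPS + (1 - local) / FAST_ETHERNET_MBPS)
    return compute_s / total_cpus + comm_s

uni = step_time(compute_s=100, comm_mb=50, cpus_per_box=1, total_cpus=128)
quad = step_time(compute_s=100, comm_mb=50, cpus_per_box=4, total_cpus=128)
print(f"128 x 1-CPU boxes: {uni:.2f} s/step")
print(f" 32 x 4-CPU boxes: {quad:.2f} s/step")
```

Under this model the quad boxes win a little because some of the communication stays inside the box; the win grows as the code is arranged to talk mostly to same-box neighbors.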
  • Wow, I bet those would make a really great... oh... never mind. They did.
  • A Quad Xeon, for those of you who don't know, is a Studly Machine. Linus Torvalds has one (at home I think) - when it was delivered the truck guy got confused because his house didn't have a loading bay. It compiles the kernel (heck, it compiles ASS) in under 60 seconds.
    128 of those boxes in one place, in one Beowulf cluster... I think I'm going to have an orgasm.
  • I was just wondering how a modern PC (dual or quad CPU) compares with the Cray 1 (1976 vintage). I mean, just how fast is today's PC at doing the kinds of things that the Cray was designed for? Can I finally tell people that I've got the power of a Cray by my desk?

    just curious,
  • You're absolutely right; there is no way they got the best bang-for-the-buck in this deal. It's not entirely irrational, though, as there are a couple of other factors besides price that weigh in here. Although 1024 P2-450s are cheaper to purchase, they still need space to put the boxes in, power to run them, network connections, hard drives, someplace to run all those cables, and cooling capacity for all those CPUs and RAM and power supplies.

    Also, the quad Xeons, while not a great deal, are probably not as bad as they look, because each node requires a case, a power supply, a minimum of one network interface (and probably more for a big cluster), a port or more in the Ethernet switch, and a hard drive. The price tag on all of these does add up, even when you're buying cheap hardware, let alone when it comes from SGI.

    I think a lot (most?) Beowulf clusters do have local storage on each node, and for good reason. The network is usually a bottleneck already, and it costs a lot more than the price of a hard drive to upgrade beyond fast ethernet (per node, that is).
  • Nah, they come in up to 8-processor varieties, though I think only the 4-way is out now, and the 8-way is coming soon.

  • It is entirely possible that the problem set is a better fit for the Xeons and therefore gains significant performance increases (above the standard Xeon 3%-5%) with them over the standard PII/PIII. This could be due to:

    1. A piece of code that needs access to >512 MB of RAM.

    2. A piece of code that performs significantly better with >512 KB of L2 cache.

    Although I am not a big fan of saying Xeon=server, there are cases where the Xeon solves a problem MUCH better than the standard PIII. I hope that the appropriate research has been done in this case, as opposed to the salesman walking in and saying "you need..."
  • Well, it looks like SGI is staying in the supercomputer business after all. I would not be surprised if they started selling Beowulf clusters as their 'supercomputer offering' in the next year or so. They believe (probably rightly) that the old monolithic idea of the supercomputer is dead and that the new supercomputer should be large clusters of workstations. It's much cheaper to build a Beowulf cluster than a Cray.

    SGI seems to be betting a large chunk of its business on Linux. Is it a shrewd business move, or done out of desperation? Only the future shall tell. And, BTW, at current market capitalization Red Hat has become the second-largest UNIX vendor (i.e. one whose primary business is based on UNIX). They could buy SGI and SCO at this point and still have some left over.
  • I bet they're gonna start selling clustered PCs as supercomputers and leave the new Cray unit to fend for itself... this sounds like a demo model more than anything.

    Who am I?
    Why am I here?
    Where is the chocolate?
  • Cray-1: 1 CPU, 80 MFLOPS theoretical peak, about 40-60 MFLOPS on real-world code.

    Cray YMP-8: 8 CPUs, 250 MFLOPS theoretical peak per CPU, about 150-200 MFLOPS on real-world code.

    Cray T94: 4 CPUs, 1800 MFLOPS theoretical peak per CPU, about 450-900 MFLOPS per CPU on real-world code.

    Cray T3E600/LC-136: 136 300MHz DEC Alpha 21164s (8 OS/command + 128 applications), 600 MFLOPS theoretical peak per CPU, about 90-150 MFLOPS per CPU on real-world code.

    SGI/Cray Origin 2000: 32 300MHz MIPS R12ks, 600 MFLOPS theoretical peak per CPU, about 160-200 MFLOPS per CPU on real-world code.

    OSC Mk.1 Beowulf node: 2 400MHz Pentium IIs, 400 MFLOPS theoretical peak per CPU, about 80-100 MFLOPS per CPU on real-world code.

    (Assuming 64-bit floating point throughout; the Intel chips don't suffer as much as you might think from this, as they do all FP internally with 80-bit precision and truncate to 32 or 64 bits.)

    If you've ever wondered why people pay big bucks for Cray vector machines, let me sum it up in three words: sustainable memory bandwidth. The T90 machines can sustain on the order of 13 GB/s of memory bandwidth, and the J90/SV1 machines can sustain about 5 GB/s. By comparison, most workstation and PC systems can sustain about 300-500 MB/s on a good day with a tail wind.
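A crude way to probe that "sustainable memory bandwidth" number on your own machine is to time a big in-memory copy. This is only a sketch, nothing like the STREAM benchmark, and it will understate what the hardware can really do:

```python
# Crude sustained-memory-bandwidth probe: time a large buffer copy and
# count both the read and the write.  A rough sketch, not STREAM.
import time

N = 64 * 1024 * 1024                 # 64 MB working set
src = bytearray(N)

best = 0.0
for _ in range(3):                   # take the best of a few runs
    t0 = time.perf_counter()
    dst = bytes(src)                 # one full read + one full write
    t1 = time.perf_counter()
    best = max(best, 2 * N / (t1 - t0))

print(f"~{best / 1e6:.0f} MB/s sustained copy bandwidth")
```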

    --Troy Baer

  • What do you guys think an Origin is? It's a cluster with a ragingly fast interconnect, but with shared memory as well. SGI has been going this way for a while. Now that all the Sun weenies have had to eat their hats about all the shit they've slung about cc:NUMA, they too have cluster-style machines in the wings, and are pushing them HARD.

    SMP/S^2MP/MPP and our little Beowulfs are getting closer together every day. (Ever notice that Myrinet is 100% like the Cray T3E torus?)

    I expect that clusters will continue to grow, and companies like whatever SGI becomes will push multi-CPU clusters on Linux.

  • Several people have commented that OSC might be coming out on the short end of the stick on this deal. We disagree, and here's why:

    1. Reliability: The most common hardware failures in cluster systems are hard drives and power supplies. The 1400Ls have redundant power supplies, and smaller numbers of nodes will generally have a lower component failure rate.

    2. Migration Path: The "cluster of SMPs" model is in use in several of the largest computers currently in use, including the ASCI Blue Mountain and Blue Pacific machines. Users should be able to develop code on our cluster and then move their code to these much larger platforms with little additional effort.

    3. Application Needs: We have several users with applications which need in excess of a GB of RAM and several GBs of temporary disk storage per node. Many of these applications are "legacy codes" from the vector machines which are difficult to parallelize using message passing approaches, but which can be parallelized relatively easily on SMP systems using compiler directives. The architecture we have selected allows this as well as multilevel parallel programming, using message passing between nodes and compiler directives within a node.

    4. Flexibility: We have users in virtually all scientific and engineering disciplines, most of whom (50-75%) write their own code. We need a cluster architecture which can accommodate a mix of serial, SMP parallel, and MPP parallel applications.

    There are also drawbacks to this approach, primarily related to memory bandwidth and the added cost for the quad processor nodes.

    Here is a slightly more detailed description of the new OSC Beowulf than was in the press release:

    32 compute nodes plus a front-end node, each with:
        4 Pentium III Xeon 500MHz processors
        2 GB RAM
        18 GB SCSI-UW disk
        1 Fast Ethernet interface
        2 Myrinet interfaces
    8 16-port Myrinet switches
    various software:
        SGI's modified Red Hat distribution
        PBS queuing system
        Portland Group and KAI compilers
        AMBER (computational chemistry)
        Gaussian 98 (computational chemistry)
        Cactus (computational physics)
    We will be posting further details at [] as things develop.
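For anyone counting, the totals implied by that parts list work out as follows (the peak rate uses the same rough 1-FLOP-per-cycle rule of thumb as the MFLOPS figures quoted earlier in this thread):

```python
# Aggregate figures implied by the OSC node list above.  Peak rate uses
# the rough 1 FLOP/cycle rule of thumb from the per-CPU MFLOPS numbers
# quoted elsewhere in this thread.

compute_nodes, front_end_nodes = 32, 1
cpus_per_node = 4
mhz_per_cpu = 500
ram_gb_per_node = 2
disk_gb_per_node = 18

total_nodes = compute_nodes + front_end_nodes
cpus = total_nodes * cpus_per_node
ram_gb = total_nodes * ram_gb_per_node
disk_gb = total_nodes * disk_gb_per_node
peak_gflops = compute_nodes * cpus_per_node * mhz_per_cpu / 1000

print(f"{cpus} CPUs, {ram_gb} GB RAM, {disk_gb} GB disk, "
      f"~{peak_gflops:.0f} GFLOPS theoretical peak (compute nodes only)")
```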

    --Troy Baer and Doug Johnson, OSC

  • A quad Xeon 500 (512 KB cache) with 256 megs of RAM runs around $14,000 using Linux.
  • But the real question is: does it run Li...

    Oh... never mind...

  • Do they need to hire any programmers?

    We often hire OSU students as programmers, gofers, experimental test subjects (oops, did I say that out loud? :), etc. It's a great place to work. Stop by at the beginning of fall quarter or watch the OSU "green sheets".

  • SGI is playing the game very well to steer clear of trouble. Their sudden move to Linux allows them (and their products) to remain in the supercomputing arena for years to come. Furthermore, they win a huge amount of good will and publicity from the open source community. Yes, the move is extremely cool.
  • Linus has stated that he doesn't want to 'kludge' the kernel to support more memory than 32 bits can address (only on 32-bit CPUs, of course). This creates a limit of 2 gigabytes of addressable memory (not 100% sure here).

    Actually, it's theoretically possible to address up to 4 GB. SGI's "bigmem" patch makes it possible to do this, but I think Linus has rejected the patch for the mainstream kernel. That wouldn't stop SGI from shipping kernels built with this patch, so long as they also distribute the source for it.
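The 4 GB figure is just pointer arithmetic: a 32-bit pointer can name 2^32 distinct byte addresses.

```python
# How far a 32-bit address reaches.
addressable_bytes = 2 ** 32
print(addressable_bytes // 2 ** 30, "GB addressable")
```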

    Will this limit any of your applications?

    Not really. Our J90/SV1 and Origin systems both have 16 GB of memory, but I don't think we allow a single job to use more than 2-4 GB of memory. Large files are actually more of a problem than large memory; Gaussian can generate 20+ GB output files. Thankfully, support for large files on 32-bit platforms seems to be coming along.

    What was your reasoning behind not choosing a similar solution from an Alpha vendor? (64-bit CPU, much more addresses)

    We have a fair amount of Alpha experience in-house already; several years ago we had a classroom cluster of DEC Alpha workstations, and we currently have a Cray T3E which is also Alpha-based. Our main concern with the Alpha was software availability, especially compilers. The Compaq Digital Fortran beta is a good start, but I would like to see the Portland Group and KAI compilers for Alpha Linux as well.

  • Also, Why put a groovy case on a machine that's going to sit in a darkened server room?

    Our supercomputers sit in a glasshouse in full view. You have to give the public something to look at or the $$$ won't come rolling in anymore. :) I'm really glad our servers aren't ugly gray or beige boxes! If you want a very fast computer, it should look the part.

  • > Surely the idea with clusters (and SMP, for that matter) is to use truckloads of cheap stuff to get a better result than one big expensive thing

    Just because a car is cheap doesn't mean it is not going to cost you in terms of repairs and shoddy engineering down the road. What people forget is that price is a function of economies of scale and functionality. Would you base a car purchase purely on the number of cylinders and the torque? Similarly, the computer purchasing experts look at the overall system: the balance between components, cost of parts, availability of drivers and software, cost of ownership, and the human learning curve. Personally I think too many people without a clue are hoodwinked by fast-talking salesdroids and bells and whistles. If you look at the hard evidence, you might even find that MIPS chips (in the Origins) give better sustained real-world application performance for a certain class of problems than even the highly touted Alphas. Also, if you look at, say, SGI's O2, you find that it is designed for easy rack-mount maintenance. All this comes at a premium.

    Sheesh... the PC makes a barely adequate car for roaming around the information backlanes, but some people want big gruntly trucks for industrial computing. Companies pay real money for the smarts to evaluate the difference between the two.


  • Neat. I wonder if they need extra sysadmins...
  • Prolly... I guess they took the bits they needed (NUMA) and sold the rest

    -- Reverend Vryl

  • Heh, yeah. I want one of the new SGI A/C's for the server room at work.
  • You bet? You cannot speculate about a company that has shown such on-par marketing techniques up till now! (Um, heh.) By the way, no more cube logo for you kids. Use the "sgi"; it's for you.

    You know, I wish you folks would come back to reality with this Beowulf crap. It may be a cheap way to weld a bunch of little bargain-basement peecees together, but no Beowulf abomination is going to do a specific Cray job like a Cray. I am assuming none of you folks making these claims have any experience whatsoever in the science and engineering disciplines (data mining, visualization, financial forecasting, and aerospace) that Cray machines are so well suited for. Cray's market share may be shrinking at the moment, but that is only due to management geniuses and bandwagon jumpers who think this whole Linux thing is going to save their company somehow.

    Linux is nowhere near the stability, scalability, and performance of a polished proprietary UNIX like UNICOS. I cringe when I see you kids wanting to install your precious Linux on a T3E so it will be "really fast". A T3E is running the best possible operating system for its architecture, as are the T90 and SV1 (know what those are?). No Linux flavor is going to do these professional machines justice, let alone make them BETTER.

    Keep it in the basement of the 14-25 year old translucent-skinned pseudohacker types living at home, because BitchX and Gimp are the only apps those kids are going to run. Oh, and have mom make me an extra grilled cheese sandwich.

    -Patrick Krekelberg
    Institute of Electrical and Electronics Engineers
