AMD Technology

Supercomputer Breaks the $100/GFLOPS Barrier (281 comments)

Hank Dietz writes "At the University of Kentucky, KASY0, a Linux cluster of 128+4 AMD Athlon XP 2600+ nodes, achieved 471 GFLOPS on 32-bit HPL. At a cost of less than $39,500, that makes it the first supercomputer to break $100/GFLOPS. It is also the new record holder for POV-Ray 3.5 render speed. The reason this 'Beowulf' is so cost-effective is a new network architecture that achieves high performance using standard hardware: the asymmetric Sparse Flat Neighborhood Network (SFNN)." Because this was a university project, KASY0 was assembled entirely by university students, which, while being a source of cheap labor, is also a good way to get a lot of students involved in a great project.
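For a quick check of the headline figure, a minimal sketch using only the totals quoted above:

    # Sanity check of the $/GFLOPS claim from the reported figures.
    cost_usd = 39500      # "less than $39,500"
    gflops = 471          # 32-bit HPL result
    print("$%.2f per GFLOPS" % (cost_usd / gflops))   # ~$83.86, under the $100 barrier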
This discussion has been archived. No new comments can be posted.

  • Wow! (Score:5, Funny)

    by fryguy451 ( 592591 ) on Saturday August 23, 2003 @10:23AM (#6772656)
    Imagine a Beowu... errr... Oh..
  • by Anonymous Coward on Saturday August 23, 2003 @10:24AM (#6772660)
    Note to moderators, Beowulf cluster jokes CANNOT be offtopic.

    Imagine a Beowulf cluster of Beowulf cluster jokes!
  • Also I wonder (Score:5, Interesting)

    by HanzoSan ( 251665 ) on Saturday August 23, 2003 @10:25AM (#6772662) Homepage Journal


    How much electricity will these super computers use up?

    All those wires, it looks like it takes up a lot of juice.
    • by gremlin_591002 ( 548935 ) on Saturday August 23, 2003 @10:29AM (#6772683) Journal
      Ponders why there are no pictures of university students in the National Geographic article on slavery....
    • by jd ( 1658 )
      That depends on how fast the students can operate the pedal-powered generators.
    • I mean these things are Athlons! Heck, they're saving money just from the fact that they'll never have to turn on the furnace again!

      Did you guys notice from the pics [aggregate.org] that there doesn't seem to be any fans in the holes on the sides? Are they crazy? These are Athlons. I hope they put enough fans in those things.
      • As much heat as power.
      • Actually, many very early supercomputers were built into the basement/cellar for this very reason. Pack your computers as low as possible, and use the convection currents to carry the heat around the building.

        (Those familiar with the University of Manchester's Department of Computation, in the UK, will understand what I mean. The architecture is designed around the computer room. Even after the truly massive lumps of iron were removed, it still wasn't until the mid 1990s that the building had a ground-flo

      • Did you guys notice from the pics [aggregate.org] that there doesn't seem to be any fans in the holes on the sides?

        See here:

        For example, each case came with two side fans, which we converted into a redundant stack venting out the back. [aggregate.org]

      • My Athlon XP 1700+ overclocked to 2000+ dissipates approximately 46W as heat. With cooling moving only from the front of the case to the back (though including one pretty fast and loud fan) it reliably stays below 104 degrees Fahrenheit. I'm guessing they're not using overclocking, and they're using the new CPUs with the higher speed bus. ZDNet claims [zdnet.co.uk] that the 2600+ dissipates 62.0 watts as heat, so there's a bit of a bump there, but since I know from experience that Athlon chips can run at 140 degrees with
      • The Athlon actually has a pretty average Watt/FLOPS ratio for a modern processor. The only one that really trounces it is the POWER series, including PowerPC. The Athlon 2600+ only uses 68.3W; compare this to a 2.4GHz P4, which uses 66.2W, and you see that they are in the exact same neighborhood. And if you include price in the equation, the Athlon becomes the leader. Also, if you had RTFA, they explain that the side fans were moved to a stacked rear configuration for better airflow and redundancy.
    • Re:Also I wonder (Score:3, Informative)

      by rusty0101 ( 565565 )
      Per the FAQ on the site, the supercomputer draws 210A. The yearly power cost is roughly equivalent to the cost of the network equipment connecting the nodes.

      210A at 120Vac via the power law comes to 25.2kw/hr. Triple that to allow for cooling (It takes approx 2 watts of power to remove the heat generated by 1 watt of power usage) and you come to almost 76kw/hr. Take a look at your utility bill to come up with the hourly cost for electricity while this thing is on.

      The equipment does not have cool
      • There was work, at one point, on a single-stage high-voltage amplifier. The idea was to reduce the unwanted distortion by reducing the stages you needed to go through.

        I think these guys need a way to tell if the computer has crashed or lost power. Y'know, UPS' have those mini alarms, but people aren't going to be around the computer all the time, and the UPS will only detect a power outage.

        I think they need a watchdog circuit, linked to a 25.2 kilowatt amplifier and a suitable speaker. That way, no matt

      • First of all, your units are all screwed up. 25.2kW was right; forget the /hr thing. Second, I refuse to believe that you need 2W of electricity to move 1W of heat. Air conditioning seems to be more in the range of 1W of electricity needed to move 2W of heat. So let us say 40kW total, which in the silly units used for electricity billing comes to 350MWh/year.

        Anyway, if you want to see stuff that really draws power, go look for the high energy physics stuff. Power cables that are liquid cooled through tubes
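        A rough sketch of the corrected arithmetic; the 40 kW total and the $0.07/kWh rate are assumed figures, not measurements from the cluster:

            # Annual energy and cost for the cluster plus cooling overhead.
            it_load_kw = 210 * 120 / 1000.0      # 210 A at 120 V = 25.2 kW of compute load
            total_kw = 40.0                      # round figure including cooling
            hours_per_year = 24 * 365
            print("%.1f kW of compute load" % it_load_kw)
            print("%.0f MWh/year" % (total_kw * hours_per_year / 1000))            # ~350 MWh/year
            print("$%.0f/year at $0.07/kWh" % (total_kw * hours_per_year * 0.07))  # ~$24,500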

      • 210A at 120Vac via the power law comes to 25.2kw/hr. Triple that to allow for cooling (It takes approx 2 watts of power to remove the heat generated by 1 watt of power usage) and you come to almost 76kw/hr. Take a look at your utility bill to come up with the hourly cost for electricity while this thing is on.

        On what planet? I cool my 60 watt or so Athlon XP 2000 using a 4 watt, 80mm fan. Add an 8 watt, 120mm fan on the intake that is WAY overkill, and a 4 watt PS exhaust fan, and I'm using 16 watts to
        • Which is fine for ONE PC. And, if you don't use air conditioning, then yes, a few watts in fans is all the cooling energy cost that you'll need. But try dissipating 25Kw. Your room and then house will heat up really quickly. Let's see, after the Big Blackout I think I saw an estimate that running a household oven is about 12Kw - so this cluster would be comparable to running two ovens 24/7. This heat has to go somewhere... and in most non-residential buildings, it'll be the air con. that has to get rid
    • Someone needs to work on their cable management skills too. One word: velcro
  • gigaflop

    As a measure of computer speed, a gigaflop is a billion floating-point operations per second (FLOPS).
    • If you're going to try to be informative, at least be accurate. There's no such thing as a "gigaflop". That would mean "Billions of Floating point Operations Per..." without the unit of time.

      It's a gigaflops (singular). The 's' is very important. It's how we know how long it takes to perform a billion floating point operations.

      It's like when people say "I had my engine up to 6000 rpms". What's an rpms? Is it a plural rpm? If so, what is pluralized? The acronym expansion yields "revolutions per

  • by CGP314 ( 672613 ) <CGP@ColinGregor y P a lmer.net> on Saturday August 23, 2003 @10:25AM (#6772665) Homepage
    Supercomputer Breaks the $100/GFLOPS Barrier

    Not after you factor in the SCO license fees.
  • by Anonymous Coward on Saturday August 23, 2003 @10:29AM (#6772681)
    Remember, everyone, this was a university project. *BSD was also a university project originally, and now *BSD is dying. So obviously university projects are not of very high quality.
  • by FreeLinux ( 555387 ) on Saturday August 23, 2003 @10:29AM (#6772682)
    Obviously, I don't get it. This doesn't look any different than redundant backbones or what is frequently done with VLANs. Multiple paths between hosts is what I see. How is this "new"?
    • by flymolo ( 28723 ) <flymolo@NOspAM.gmail.com> on Saturday August 23, 2003 @10:38AM (#6772715)
      Due to "creative" (computed) wiring, if all switchs are functioning, no node is more than one hop from each other node. This requires a routing table written for each pc. It could be used for redunancy, but it is being used to minimize latency, and collisions, which are both killers in clusters.
      • no node is more than one hop from each other node. This requires a routing table written for each PC.

        Admittedly, I understand that no node is more than one hop away. But, how is this different than all nodes plugged into a large switch like a Cisco 6500 or a Nortel Passport 8600? These switches can have ~128 ports and can switch 256Gbps aggregate throughput at wire speed. Add another switch and then add a second NIC to each host and you increase the capacity even further. Additionally, this does not requi
        • But, how is this different than all nodes plugged into a large switch like a Cisco 6500 or a Nortel Passport 8600?

          It's cheaper.

        • Here's a quote from the site:

          Does The World Need Yet Another Network Topology?

          One would think (well, we did ;-) that the latest round of Gb/s network hardware would have made the design of a high-bandwidth cluster network a trivial exercise. However, that isn't the case when the prices are considered:

          • When we invented FNNs in 2000, the cheapest of the Gb/s NICs available were PCI Ethernet cards priced under $300 each; now they are $50-$100. Prices have continued to drop. Prices on custom high
        • The technique that was used seems to be more of a mental exercise in making spaghetti, I don't see it reducing latency or increasing performance beyond the currently used techniques.

          It significantly reduces cost. In wire speed switches (FastE or GigE) there will typically be a sweet spot for price/performance. Beyond that point, switch prices jump into the stratosphere.

          For larger clusters, there simply aren't any switches big enough at any price (just try to get a 256 port GigE wire speed switch for e

        • (they have 64 machines, not 128, so I have done the numbers with this).

          You can increase performance. Rather than one Gb port into a very expensive 64-port switch, giving you a maximum of 128Gb of bandwidth (bidirectional 64x1Gb), you can (if you use the calculator) stick 4 Gb ports in each machine, buy 11 cheapo Dell 24-port gigabit switches (about $3k each), have one switch of latency, and have 4 times the total non-blocking bandwidth available. And the switches will still cost you less than one 64-port gig switch.
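          A quick check of that port math, taking the figures in this comment (64 machines, $3k switches) at face value:

              # Port count and rough cost for the 4-NIC-per-node layout described above.
              nodes, nics_per_node = 64, 4
              switches, ports_per_switch = 11, 24
              print(nodes * nics_per_node, "node ports needed")           # 256
              print(switches * ports_per_switch, "switch ports bought")   # 264, enough to wire it
              print("switch cost ~ $%d" % (switches * 3000))              # ~$33,000 at ~$3k each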
    • Traditionally, people have tried to keep their routing tables small. When you're routing in hardware, the larger your routing table, the slower (or more expensive) your routing hardware is. As a result, you want to have single routes which apply to entire groups of hosts (eg, "packets for nodes 0-127 go through port 0, packets for nodes 128-255 go through port 1").

      Because the routing is being done in software instead, the cost driver is dramatically reduced; consequently, it becomes cost-effective to hav
      • Because the routing is being done in software instead, the cost driver is dramatically reduced; consequently, it becomes cost-effective to have a routing table with an entry for each node.

        I was actually wondering how well Linux would handle this. The obvious algorithm to find the correct entry in the routing table is linear in the number of entries. That doesn't sound efficient to me, but it might be that 100 entries is still so small a number that it doesn't matter. However this particular cas
  • this is nice (Score:2, Interesting)

    but supercomputers, as in giant iron, are becoming more specialized and as such would whoop the pants off a Beowulf cluster when competing in their specialty.

    of course, if you just need a lot of general purpose super computing, it is obvious that you cannot compete with this.
    • Wrong (Score:3, Informative)

      by imsabbel ( 611519 )
      In reality, Beowulf clusters are good for only a subset of supercomputing tasks and the "real" supercomputers are still best at general purpose supercomputing.

      If you can parallelize your application well enough, Beowulf rules, but if you need a lot of node-to-node communication, the network cost quickly surpasses the CPU cost of the system
      • Re:Wrong (Score:4, Insightful)

        by sjames ( 1099 ) on Saturday August 23, 2003 @02:23PM (#6773733) Homepage Journal

        Really, it's a spectrum. On one end you have fully commodity Beowulf, in the middle you see things like Dolphin and Myrinet, and on the high end you see fully custom backplanes, and sometimes RAM and I/O controllers as well. Purpose-built CPUs are becoming less common now, but not unheard of.

        Each step up the spectrum widens the domain of problems that the machine can work on efficiently, and raises the price for the machine. In many cases, a 'real' supercomputer is more or less a cluster with a specialized network and OS and mounted in a single cabinet so it doesn't look like a cluster.

        In general when a lower end machine can efficiently run your program, there is no benefit to using a more expensive machine.

        As server hardware improves and 'exotic' hardware becomes more mainstream, the gap between the low and the high end narrows. There will probably always be a small but existent set of problems that call for the 'real' supercomputer, but that set is shrinking.

        There are other considerations as well. If the Beowulf in your lab can solve the problem in 1 week and is available now, while the 'real' supercomputer on the other campus can solve it in 4 hours and will have a timeslot available in 2 weeks, the Beowulf is 'faster' from your point of view.

  • by gorim ( 700913 ) on Saturday August 23, 2003 @10:31AM (#6772691)
    And it was introduced to consumers just a couple years
    ago. Sorry, the AMD beowulf cluster at $100/GFLOP just
    isn't that impressive.
    • by Sycraft-fu ( 314770 ) on Saturday August 23, 2003 @10:48AM (#6772764)
      I'm guessing the latter. You see all sorts of BSified numbers from marketing departments on processors, but they have little to do with reality. The number for this AMD cluster is a real, actual, measured-using-a-real-world-app number. To give you some idea of BS console numbers, the Xbox has a PIII 733 processor in it (ok, technically it's a little different, but it's a P3 core). Now the Gflop claim is 2.93. Out of a P3 733? Ya right, on paper perhaps but never in the real world, much less on a real app.

      Then, of course, there is the issue of specialised chips vs normal chips. A GeForce 4 4400 can claim, roughly, 80 Gflops peak. That sure beats the hell out of any single CPU I've ever heard of, including the Power4. Thing is, the GeForce 4 is a graphics DSP, it isn't a general purpose CPU. It can do that kind of math when all its units are working at what they do best, but try to reprogram it to do something else and it will slow to a crawl (for that matter I'm not even sure that it is Turing complete).

      So don't take any hype on a console to equate to real performance in a general task. Oh, and the BS marketing number I see for the PS2's Emotion Engine is 6.2Gflops.
      • Of course, assuming it's only half the parent comment's assertion, thus 2.25 GFlops, at $180 it's still cheaper than $100/GFlop. However, as others (should?) have pointed out by now, it's useless as a supercomputing node for all but the smallest tasks since it has no local storage and extremely limited main memory. You will have to spend another $200 for a linux kit to get storage and networking, bringing it up to $380 for the system. If it were actually 5.5 GFlops in the real world, then that would still b
    • 5.5 Gflops, I dunno if it can really do that, but ...uh..the point is that it's the first *supercomputer* to break the $100/GFLOP barrier. The Playstation2, last I checked, isn't a supercomputer, it's a videogame platform.

    • In cache maybe (Score:3, Informative)

      by msgmonkey ( 599753 )
      These numbers for microprocessors etc. mean nothing because they are usually referring to operations on data in cache... you'll find that real life performance is 10-20x slower because that's how much slower accessing main memory is.
    • Nice how you take the numbers from a marketing press release and treat them as if they are the absolute, indisputable truth. Can you show me the actual, reproducible benchmark that produced those numbers?

      Also, the PS2 is not a supercomputer. It has a slow processor and very little RAM, so it wouldn't be able to do much number-crunching. You can't hook PS2s together, anyway, so comparing a single specialized machine to a cluster is absolutely meaningless.
    • The key word here is supercomputer. A PS2 is not a supercomputer.
    • The previous price/performance champ was in fact a PS/2 cluster, mentioned here, but this AMD cluster is roughly three times the performance for the dollar. You can check the stats with different assumptions on their FAQ [aggregate.org] page, particularly the section labeled 'Is KASY0 really the first supercomputer under $100/GFLOPS?'

    • Gah feel free to mod the previous version of this comment into oblivion, I hit submit accidentally.

      The numbers you're looking at are marketing numbers first off, and overly generous. Second you don't scale for free - you never get anything like 100 times the performance of a single box when you wire 100 together, for the same reason that you don't get twice the horsepower out of an engine twice the size.

      The previous price/performance champ was in fact a PS/2 cluster, mentioned here [com.com], but this AMD cluster

  • though is how many mp3's are these students sharing on this monster ?


    • People don't share mp3s anymore; if they do, the FBI, NSA, Secret Service, CIA, and Homeland Security Dept. will swarm them and put them in the bay.

      I mean, I wish we could crack down like this on organized crime, or on domestic terrorists. I'm surprised we are so aggressive at arresting teenagers who download music, but the KKK and Neo-Nazis can collect a million guns and spread their crazy hate speech and it's protected by freedom of speech.

      I'd think that hate speech does more harm than copyright infringement
  • each node has two side case fans! that's gotta be the most dedicated case modding job i've ever seen! 132 pc's with 2 fans! too bad they didn't put fan guards ... or interior lights.. or blue led's... but i guess all that junk about a supercomputer makes up for it...
  • by krahd ( 106540 ) on Saturday August 23, 2003 @10:41AM (#6772730) Homepage Journal
    and it still can't run Doom III at a decent rate.

    --krahd

    mod me up, scottie!
  • Comment removed (Score:4, Interesting)

    by account_deleted ( 4530225 ) on Saturday August 23, 2003 @10:44AM (#6772750)
    Comment removed based on user account deletion
  • Cooling (Score:4, Informative)

    by bengoerz ( 581218 ) on Saturday August 23, 2003 @10:45AM (#6772756)
    I toured the previous cluster these guys did (KLAT2) and was very impressed. However, using AMD Athlon Thunderbirds last time, it did get quite hot. I remember standing by the cluster looking at all the wiring and being bombarded by an overhead cooling vent. I'm also assuming that these cooling issues are the reason that each case has two blow-holes. I'd also like to see these guys post in-depth specs of each machine. Being a hardware nut, I'd like to see how they got so many machines so cheap, and maybe even what vendor they used. As I remember, they worked REALLY hard on their last cluster to keep costs to an absolute minimum.
  • by borgdows ( 599861 ) on Saturday August 23, 2003 @10:50AM (#6772776)
    Dear customer,

    At the cheap introductory price of 699$ for 80 lines of code in the Linux kernel, it will cost you 8,377,500$ by kernel since we have discovered that in fact 1000000 lines of SCO IP were copied into Linux.

    Designation .. Price .. Qty .. Total
    Linux kernel .. 8,377,500$ .. 128 .. 1,118,400,000$

    So you must pay us only 1,118,400,000$, and in my almighty kindness I will offer you a discount of 118,400,000$ so you only have to pay ONE BILLION DOLLARS if you pay before tomorrow!

    Please send your credit card number to darl@sco.com

    Sincerely yours,

    -- Darl McBride
  • Nice wiring! (Score:2, Insightful)

    by nate.sammons ( 22484 )
    Looks like most of the wiring jobs I've seen done by students: kasy0core.jpg [aggregate.org].

    God forbid they use cable gutters ;-)

    Other than that, kick ass job guys!

    -nate
  • Hey! I used to work there.

    Way to go Dr. Dietz!

    So, mod me anyway you want, karma to burn.
  • by SilverSun ( 114725 ) on Saturday August 23, 2003 @11:04AM (#6772823) Homepage
    I wonder which universities/institutes have larger and maybe cheaper clusters, but just don't bother with running benchmarks. I for one am sitting next to a tiny cluster with 40 dual-CPU nodes, which is connected (GRID-like) to a 340 dual-node cluster in a nearby town. None of us high energy physicists bothers with running any benchmarks on our clusters, other than our own applications. I wonder how many "linux-cluster-supercomputers" are out there which would easily make it into the top 500, but no one has ever heard of....

    Cheers.
  • by SuperBanana ( 662181 ) on Saturday August 23, 2003 @11:07AM (#6772834)
    Because this was a university project, KASY0 was assembled entirely by university students, which, while being a source of cheap labor, is also a good way to get a lot of students involved in a great project.

    At the risk of being flamebait- No. Using university students is almost always purely a way of getting cheap labor to do semi-mindless, or completely mindless, stuff the staff doesn't want to do- it's a common myth that students 'learn' by doing grunt work. I should know- I have several grad student friends, and they've thus far spent a large part of their academic careers working in labs doing mind-numbingly boring stuff (according to them.)

    Imagine if a Bio lab did this. The following would sound pretty absurd: "Help us move our lab, you'll learn about cellular recombination!". No. You'll learn what a bunch of lab equipment looks like, how eccentric the professors are, and how expensive/fragile/heavy the equipment is, and the next morning what sore muscles are like. Let's get a reality check here.

    (from the site):Our group develops the systems technology for cluster supercomputing; the more people we can show how to apply these technologies, the better.

    Huh? What cluster supercomputing "technology" does assembling a PC and plugging it into ethernet teach you? Did they give a presentation about how clustering technology works, for example? Did they explain to each person, as they put a machine in a particular place and wired it to a particular switch, WHY it was going there etc? Obviously I wasn't there, so perhaps someone from the group can contribute on this point.

    • by panda ( 10044 ) on Saturday August 23, 2003 @11:46AM (#6772963) Homepage Journal
      Having worked there, and knowing what Hank Dietz and his students are doing, I can tell you that it is different from just slapping PCs together, stringing wire between them and installing clustering software.

      Dietz specializes in networking and all the wiring that you see in the photos is charted out by custom software that he's written just for this purpose.

      He works in the realm of optimizing communications among the nodes to avoid network latency and so on. If you read the POVRay benchmarks, you'll notice that the author comments that several clusters' CPUs spend most of their time idle due to network latency. Dietz is researching the best ways to eliminate much of that latency so that the CPUs in the cluster can spend more of their time crunching data rather than just throwing off heat. To my knowledge, he is succeeding at this and better than most other researchers in the field.

      As for what his students learned from this, I don't know exactly which students helped him on this. For KLAT2, there were several undergrad volunteers who helped with wiring and assembly, mostly from the campus Linux Users' Group. I know his grad students and research assistants are learning a lot about how clustering and network tech works, and a couple are doing their Ph.D. disserts in this very subfield of E.E.
      • It seems to me that the next step is to get some big switches with VLAN support and reconfigure them dynamically as the workload changes in order to maximally utilize all nodes. I wrote some pathetic little software once to log into some switches and make vlan changes from a web interface (no security or anything, what a bad idea eh? worked though. this was before cisco included the "why bother" ssh1 in ios) so at least THAT part is trivial :D
      • I worked there as well, back during the KLAT-2 days. Sure, I don't remember getting any actual mention on the project (even if I did help coordinate the student help, help build it, assemble it, etc.), but that's ok. It was a great project to work on (I still even have the GaLUGtica videos I made in 3dsmax) and I did learn quite a few things. Deitz's harpy of darkness, aka Tim Mattox, knows his stuff, and he, along with Dieter, was very helpful to the students, answering whatever questions they had.

        Petty 1 of
    • At the risk of being flamebait- No.
      He who moderates you has been infected with the reverse psychology bug! See sig.
    • On the other hand, to build a supercomputer for less than $89 per GFLOPS you still have to actually "build it". I mean, who else is going to put it all together? Someone has to build it if you want CS students to use it.

      And yes microbiology students will still have to build their own apparatus for experiments they conduct - I only know this because I took a class in microbiology a while back and I had to build the apparatus for all the required experiments I had to do.

      I'm guessing in this case they not only
  • In other news... (Score:2, Insightful)

    by rmdyer ( 267137 )
    Now that the university students have graduated and moved on, there isn't any documentation, nor do they know how to use the darn thing...

    -1
  • why not DSP? (Score:5, Interesting)

    by mike_g ( 24445 ) on Saturday August 23, 2003 @11:11AM (#6772842) Homepage
    Why aren't DSPs used in configurations such as this? The TI 67xx series are able to perform about 1 GFLOP/s running at only 150 MHz and cost only about $40 per chip.

    This price/performance ratio seems to make them very attractive compared to general purpose CPUs. According to the NASA G5 Study [cox.net], the P4 2.66 GHz is only able to achieve 255 MFLOP/s. And the P4 costs about 4x the price of the 6711 DSP.

    It seems that DSPs should be the clear winner in supercomputer applications, what are their disadvantages and why are they not used? Granted there is a lack of mass produced hardware such as motherboards for DSPs, but that alone should not exclude them from the supercomputer realm.
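    Taking the quoted figures at face value, a rough price/performance comparison; the P4 price below is just the poster's "about 4x the price" claim, not a verified number:

        # Dollars per GFLOPS from the figures quoted above.
        dsp_cost, dsp_gflops = 40.0, 1.0        # TI C67xx-class DSP, per the comment
        p4_cost, p4_gflops = 4 * 40.0, 0.255    # "about 4x the price", 255 MFLOP/s per the cited study
        print("DSP: $%.0f/GFLOPS" % (dsp_cost / dsp_gflops))   # $40/GFLOPS
        print("P4:  $%.0f/GFLOPS" % (p4_cost / p4_gflops))     # ~$627/GFLOPS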

    • I guess it's the lack of operating system support on the DSPs themselves. Plus their instruction sets and I/O don't lend themselves well to general purpose computing. The cost of developing a node consisting of a DSP plus a general purpose processor, plus the efficient I/O to the DSP, might be too high for the relatively restricted usage on supercomputers.

      That said, my Palm Tungsten is a combo of a GP processor and a DSP, as I believe are several Sony variants. Perhaps as I/O on handhelds improves (?) the

    • There's a lot of OS essentials that could be moved easily into hardware. By using programmable gate arrays, or by just etching the kernel directly onto silicon, they should be able to reduce the energy requirements and thereby the actual cost.


      Further, it would also accelerate the product enormously - Linux on a Chip would be blazingly fast, as it wouldn't take any processing power away from what it was running - thereby also reducing the cost per GFLOP.

    • Re:why not DSP? (Score:4, Informative)

      by SmackCrackandPot ( 641205 ) on Saturday August 23, 2003 @12:36PM (#6773200)
      Actually, they do, but they are referred to as vector processors rather than DSPs. Probably the most famous, and the first, was the Cray supercomputer [cray-cyber.org]. And there was also the INMOS "Transputer" [ox.ac.uk]

      DSPs are optimised to handle streamed data of a particular maximum size (e.g. 4-element floating-point variables). Useful for image processing (red, green, blue, alpha) and 3D graphics (XYZW), but if you're modelling something like ocean currents or global weather, where every data element is more than likely going to have more than four variables (e.g. temperature, humidity, velocity, pressure, salinity, ground temperature), you may not get full optimisation.

      Plus, you also need a means of getting all these processors to talk to each other. DSPs are nearly always optimised to operate in single pipelines, so don't need much communication support (e.g. Sony Playstation 2). However, if you're designing a supercomputer system, the major bottleneck is the communication between processors (network topology). Some applications might only need adjacent processors to talk to each other (global weather simulation usually represents the atmosphere as a single large block of air, with sub-blocks assigned to separate processors). Other applications might assign individual processors to different tasks, which complete at different rates (e.g. the Mandelbrot set). A configurable network architecture allows the system to be used for many more different applications.
    • Why aren't DSPs used in configurations such as this?
      1. Non-commodity hardware has high one-time expenses for design.

      2. DSPs tend to not have a lot of RAM, whilst big modelling apps crave RAM (esp. raytracing).

  • by prof_bart ( 637876 ) on Saturday August 23, 2003 @11:33AM (#6772910)
    Hmmm...

    Nice machine, but this January, CITA and the astro department at the University of Toronto brought a 256-node dual Xeon system on line: "1.2 trillion floating point mathematical operations per second (Tflops) on the standard LINPACK linear algebra benchmark." Total cost: CDN$900K (including tax) (at January prices, that's $600K U.S., or about $500 USD/GFLOPS.) It's being used for some very cool astro simulations...

    See http://www.cita.utoronto.ca/webpages/mckenzie

  • Am I missing something? They say:

    KASY0 nodes are completely diskless; there isn't even a floppy. (from the FAQ [aggregate.org])

    So how are the nodes booted? Are there BIOSes out there that can netboot?

    -c

    • Many ethernet cards have a socket for a programmable chip that allows netbooting. Pretty much all you need is the address of the server from where to retrieve the rest of the software. Usually the kernel is loaded via tftp then the rest of the os is NFS mounted. I don't know if this is how the article is doing it, but the netboot stuff is pretty common and easy to configure.
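      A minimal sketch of what serving such diskless nodes could look like with ISC dhcpd plus tftp and an NFS root, as described above; the MAC addresses, server IP, and paths are placeholders, not KASY0's actual configuration:

          # Emit ISC dhcpd host stanzas so each diskless node PXE-boots a kernel over tftp
          # and mounts its root filesystem over NFS.  All values below are made up.
          nodes = {"node%02d" % i: "00:30:48:00:00:%02x" % i for i in range(4)}
          for name, mac in sorted(nodes.items()):
              print('host %s {' % name)
              print('  hardware ethernet %s;' % mac)
              print('  next-server 10.0.0.1;                  # tftp server')
              print('  filename "pxelinux.0";                 # network boot loader')
              print('  option root-path "10.0.0.1:/export/roots/%s";   # NFS root' % name)
              print('}')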
    • Yep, it's called IBA or Intel Boot Agent; it allows booting of a diskless system through PXE. It's actually where Palladium came from originally. In order to pull a boot image over the network and be sure it was not tampered with on the wire through a man-in-the-middle attack, you need hardware crypto with a signed boot image. Basically every PC made in the last 4-5 years or so supports it (there are exceptions, but they are usually consumer-oriented-only PCs; corporate PCs almost all support it). It's
  • by Tiosman ( 614633 ) on Saturday August 23, 2003 @12:14PM (#6773086)
    It's not the first time that these folks in KY have worked around the definition of the acronym "FLOP". A FLOP is a floating point operation on 64 bits, not 32 bits. All entries in the Top500 used results with 64-bit HPL; nobody else in the world is running HPL on 32 bits. So claiming the moon on 32 bits is easy, useless for the sake of comparison, and almost unethical. I cannot believe that Dr. Dietz does not know the difference by now.

    The same machine would yield average results on 64 bits. Difficult to draw attention without headline numbers...

  • ... they're going to have the largest Quake LAN party ever!
  • overclocking (Score:2, Insightful)

    by snooo53 ( 663796 )
    Looking at the specs, I'm curious if anyone thought of overclocking the machines to get an even bigger performance increase. It seems that with most Athlons you can get at least a good 100 MHz of extra speed, even with a stock cooler, by increasing the FSB/multiplier and not even touching the voltage. Even a modest increase like that would yield an extra 12.8 GHz of aggregate clock across the cluster, dropping that price figure even further. Depending on what type of computing they're doing, increasing the FSB might have an even bigg
    • That would be stupid. The entire point is, the nodes are so freaking cheap, that if you really want an extra 5% performance you just buy a few more nodes. Gee, what do I choose, buy a few more nodes, or spend two weeks overclocking all these finicky chips and trying to get them to run correctly?

      Besides, nobody in their right mind would run a parallel program of any importance on a "rigged" setup like that.

      • Not to mention if you can only overclock, say, 50% of them, would you run into problems with nodes running at different speeds?
        • It depends what sort of cluster it is. If you have a standard network of workstations, and you're running something like PVM or MPI, then each node can run at a different speed. In fact, they don't even have to be the same kind of nodes (you can have different platforms, say Solaris and Linux, both running in the same virtual parallel machine). Usually you will have to adjust your algorithms to account for nodes running at different speeds, but it doesn't make it impossible (see the sketch below).

          MOSIX is a parallel cluster oper
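          A minimal self-scheduling sketch using mpi4py (an assumption; the poster only mentions PVM/MPI generally): faster nodes simply ask for work more often, so mixed-speed nodes stay busy instead of waiting on the slowest machine. Run with e.g. "mpiexec -n 4 python selfsched.py".

              # Self-scheduling master/worker pattern for heterogeneous node speeds.
              from mpi4py import MPI

              comm = MPI.COMM_WORLD
              rank = comm.Get_rank()
              TASKS = list(range(100))       # stand-in work units
              STOP = None                    # sentinel telling a worker to quit

              if rank == 0:                  # master: hand out tasks on demand
                  status = MPI.Status()
                  handed_out, stopped = 0, 0
                  while stopped < comm.Get_size() - 1:
                      comm.recv(source=MPI.ANY_SOURCE, tag=1, status=status)   # "give me work"
                      worker = status.Get_source()
                      if handed_out < len(TASKS):
                          comm.send(TASKS[handed_out], dest=worker, tag=2)
                          handed_out += 1
                      else:
                          comm.send(STOP, dest=worker, tag=2)
                          stopped += 1
              else:                          # worker: loop at whatever speed this node manages
                  while True:
                      comm.send(None, dest=0, tag=1)
                      task = comm.recv(source=0, tag=2)
                      if task is STOP:
                          break
                      _ = task * task        # stand-in for the real computation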

  • But the "FA" says $1000 per gflop not $100

    Did you RTFA?
  • by Axe ( 11122 ) on Saturday August 23, 2003 @02:20PM (#6773722)
    That they owe to SCO, those damn commies? Did they at least acknowledge using stolen property?

    What a shame. Freeloaders. They would never be able to achieve such performance if not for the fruits of labour of SCO... eeeh... lawyers?

  • Er...you can do that with parts from ebay or craigslist without too much trouble.
  • You have to include people time, building overhead, etc. A research grant may be billing $500 - $1000 a day for this. If this takes 50 man-days to set up, then the cost is another $50,000.
