
Cray XT-3 Ships
anzha writes "Cray's XT-3 has shipped. Using AMD's Opteron processor, it scales to a total of 30,580 CPUs. The starting price is $2 million for a 200 processor system. One of its strongest advantages over the standard Linux cluster is that it has an excellent interconnect built by Cray. Sandia National Labs and Oak Ridge National Labs are among the very first customers. Read more here."
imagine a... (Score:5, Funny)
Re:imagine a... (Score:2, Interesting)
Re:imagine a... (Score:5, Insightful)
When you have a single CPU, designing the system to be pretty fast is easy. There's no major contention to deal with.
Two CPUs? Slightly harder, but reasonably straightforward. You don't see a 2x improvement in speed over one CPU, but it's around 1.95x, give or take a bit.
Four CPUs? Now you're starting to see less improvement ... probably around 3.2x, because of all the contention issues.
Sixty-four CPUs? You'll be lucky to get a 50x speed up over a single CPU.
When you get to 200 CPUs, the issue of access to shared memory and other shared resources becomes critically important. It's also an issue that most computer buyers don't need to worry about, because they don't have 200 CPUs in their system. This means that you have a lot of highly specialised research going on, and relatively few buyers to spread the cost of that research over.
Two million for a 200 CPU box which has low latency, low contention, and solid reliability is not a lot at all. You might not buy it. That doesn't mean nobody will.
Yeah, it's gotta be awful (Score:2, Funny)
Next time, I'm just gonna build a beowulf cluster out of 200 overclocked AMD Barton 2500s. I shall NOT be suckered again!
Re:Yeah, it's gotta be awful (Score:3, Funny)
Re:imagine a... (Score:5, Informative)
The upper bound on speedup is generally given by Amdahl's law [wikipedia.org]. Put plainly, efficiency approaches zero as the number of processes increases. The major sources of overhead are usually communication, idle time, and extra computation. Interprocess communication is zero for a serial program, so in this context (we're talking message passing) it is pure overhead. Idle time contributes to overhead because processes sit waiting for information from others. Extra computation is virtually unavoidable at some point; for instance, in MPI's Single Program Multiple Data model, every process in a tree-structured communication other than the root ends up idle before the computation completes, and every process spends cycles deciding whom to communicate with based on its rank.
There are notable exceptions to Amdahl's law, however; Gustafson, Montry, and Benner wrote about this in Development of parallel methods for a 1024-processor hypercube, SIAM Journal on Scientific and Statistical Computing 9(4):609-638, 1988.
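(Not in the parent posts: for anyone who wants to play with the numbers, here's a minimal C sketch of Amdahl's law, S(N) = 1 / (f + (1 - f)/N). The 1% serial fraction below is just an illustrative assumption, not anything from the Cray specs.)

/* Illustrative only: Amdahl's-law speedup for an assumed serial fraction. */
#include <stdio.h>

/* S(n) = 1 / (f + (1 - f) / n), where f is the serial fraction of the work. */
static double amdahl_speedup(double serial_fraction, int procs)
{
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / procs);
}

int main(void)
{
    const double f = 0.01;                 /* assumed 1% serial work */
    const int counts[] = { 1, 2, 4, 64, 200, 30580 };
    const int n = (int)(sizeof counts / sizeof counts[0]);

    for (int i = 0; i < n; i++) {
        double s = amdahl_speedup(f, counts[i]);
        printf("%6d CPUs: speedup %9.2fx, efficiency %.3f\n",
               counts[i], s, s / counts[i]);
    }
    return 0;
}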
Re:imagine a... (Score:2, Informative)
1 CPU @ 1.00x -> 1.00 / 1 = 1.000
2 CPUs @ 1.95x -> 1.95 / 2 = 0.975
4 CPUs @ 3.20x -> 3.20 / 4 = 0.800
64 CPUs @ 50.0x -> 50.0 / 64 = 0.781
Pop that into an OpenOffice.org spreadsheet and look at the graph.
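(Or, if you'd rather not fire up a spreadsheet, a few lines of C reproduce the efficiency column; the speedup figures are the grandparent's estimates, not measurements.)

/* Reproduces the efficiency = speedup / CPUs column from the parent post. */
#include <stdio.h>

int main(void)
{
    const int    cpus[]    = { 1, 2, 4, 64 };
    const double speedup[] = { 1.00, 1.95, 3.20, 50.0 };

    for (int i = 0; i < 4; i++)
        printf("%2d CPUs @ %5.2fx -> efficiency %.3f\n",
               cpus[i], speedup[i], speedup[i] / cpus[i]);
    return 0;
}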
That is not linear; in fact, it's non-linear in the direction that *helps* as you add more and more processors. If the decline from 4 CPUs to 64 CPUs is a mere 1.9% efficienc
Re: (Score:2, Funny)
Re:My new dream toy (Score:4, Funny)
How big is it? (Score:3, Interesting)
The article doesn't appear to mention its dimensions, and I'm curious to know what kind of space you need to install this baby. Anyone got any idea?
Re:How big is it? (Score:4, Informative)
Weight (maximum): 1529 lbs per cabinet (694 kg)
http://www.cray.com/products/xt3/specifications
Re:How big is it? (Score:2, Funny)
Dimensions (cabinet): H 80.50 in. (2045 mm)
Wow... for the first time in my life, I couldn't picture 80 inches, but I could picture 2 meters. I think there may be hope for the metric system after all.
Re:How big is it? (Score:2)
How big it is (Score:2, Informative)
Dimensions (cabinet):
H 80.50 in. (2045 mm) x W 22.50 in. (572 mm) x D 56.75 in. (1441 mm)
Sorry to reply twice but I forgot this detail.
Re:How big is it? (Score:2)
If you're paying 2 million and upwards for a thing like this you probably can afford an appropriate space with appropriate climate control.
(OK, so some people cram ${car_price * 10} worth of HIFI equipment into a ${small_japanese_car}, but I doubt anyone would want one of these installed in their closet...)
You can get to their website? (Score:2)
I'll pass for now. (Score:4, Funny)
Re:I'll pass for now. (Score:2, Funny)
we're getting closer... (Score:5, Funny)
Re:we're getting closer... (Score:3, Funny)
Ahh, now that's what I call an optimist.
Re:we're getting closer... (Score:3, Funny)
A few more years of computer advances and this joke will still be modded funny!
Re:we're getting closer... (Score:3, Insightful)
It was funny like a year ago. Now it's as overused as an SNL skit.
Re:we're getting closer... (Score:5, Insightful)
Back in my day we spelled "enuff" without the 'f' character and it was good enough for us.
$2 million for a computer? (Score:3, Funny)
I can't believe people complain about the price of iMacs....
Re:$2 million for a computer? (Score:3, Funny)
Please.. it doesn't have any.. it just *knows* what you want to do before you *know* what you want to do..
real FPU operations (Score:5, Interesting)
I ask, because I remember that the Athlons beat the pants off the Pentium 4's in FPU operations, so all the benchmarks were rewritten to use SSE2.
Re:real FPU operations (Score:5, Informative)
Re:real FPU operations (Score:2)
Re:real FPU operations (Score:5, Interesting)
1) You can use it in scalar mode, in which case it's almost like x87, only a bit faster because:
a) It doesn't use a braindead register model (stack)
b) On P4, you can do a mul and an add in parallel with SSE, but not with x87
2) You can use SSE intrinsics. It's not as easy as "normal" programming, but easier than assembly and almost the same speed (there's a short intrinsics sketch after this list).
3) Unaligned access is possible. It's slower than aligned access, but overall better than non-vectorized code.
4) Trig is so slow that SSE/x87 doesn't matter (unless you write approximations, in which case SSE will also be faster).
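(A minimal sketch of points 2 and 3, assuming GCC-style SSE intrinsics from xmmintrin.h; the arrays are made-up test data, not anything from the parent.)

/* Minimal SSE example: add two float arrays four elements at a time.
 * Compile with something like: gcc -msse -O2 sse_add.c */
#include <stdio.h>
#include <xmmintrin.h>

/* Unaligned loads/stores (_mm_loadu_ps/_mm_storeu_ps) work on any pointer;
 * aligned ones (_mm_load_ps) are faster but require 16-byte alignment. */
static void add_arrays(const float *a, const float *b, float *out, int n)
{
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; i++)          /* scalar tail */
        out[i] = a[i] + b[i];
}

int main(void)
{
    float a[6] = { 1, 2, 3, 4, 5, 6 };
    float b[6] = { 10, 20, 30, 40, 50, 60 };
    float c[6];

    add_arrays(a, b, c, 6);
    for (int i = 0; i < 6; i++)
        printf("%g ", c[i]);
    printf("\n");
    return 0;
}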
No one does trig with x87 anyway. (Score:2)
The only time you should be using fsincos (SLOW) is when you need to build a table or populate variables accurately before a loop.
Re:real FPU operations (Score:4, Insightful)
Re:real FPU operations (Score:2)
Anyway, it depends on how you're using the floating point numbers: the standard 387 FPU instructions are faster for scalar work, but the SIMD (SSE) operations are more efficient when used in their intended role of vector calculations.
Or so I've heard. YMMV.
(BTW: nice sig...)
Just the name brings back memories (Score:3, Informative)
The name is synonymous with speed and power and the unwillingness to cut corners in order to shave a few dollars off the final product. When you buy a Cray, you know you are getting top of the line hardware.
It looks like Sandia wants to build the fastest supercomputer in the world by clustering a few of these monsters, and I have no doubt that they will. Looks like more fun articles about this in the future.
There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that. The other application is in weather prediction. By feeding in current weather variables into a well-written model, a supercomputer is able to predict to a large degree of accuracy the future weather. Such an application will always be welcome.
I think I'm going to have to fire up the old ][e, the nostalgia is killing me!
Re:Just the name brings back memories (Score:5, Informative)
There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that. The other application is in weather prediction.
Oh, please. Buy a clue, will ya? There's lots and lots and lots of applications that use supercomputers, or could use if they were more affordable. A few examples from the top of my head:
Materials science, that is, ab initio simulations, moldyn, you name it. This alone probably uses >50% of all supercomputer CPU time in the world. By comparison, weather prediction and nuke simulations are small potatoes (or shall we say, the simulations as such are big, but the number of people engaged in weather prediction or nuke simulation is really small compared to all the supercomputing materials scientists).
CFD, the automobile and aerospace sectors are big users.
Electronic design.
Seismic surveys, the oil industry uses lots and lots of supercomputers to find oil deposits.
Biology. Gene sequencing, moldyn simulations of lipid layers and whatever.
Climate prediction, somewhat related to weather prediction. Official purpose of the Earth Simulator.
All of the examples above could easily use almost any amount of cpu power you can throw at them. The only thing that stands between a lot of scientists and improved understanding of the world is computing power.
No, what stands in the way is price (Score:4, Interesting)
E.g., yeah, having a 30,000 CPU super-computer to simulate your gene model on would be nice. Forking over half a billion for it, well, it's suddenly not that nice any more.
Having one of those to simulate an electronic circuit, now that would probably rock. Again, paying half a billion for it, suddenly isn't that attractive.
The real question isn't how nice a toy you'd like to have, it's ROI. (Unless you work for the government, and just have a budget you _have_ to blow on stuff, whether you need that stuff or not.)
And in that context, you'd be surprised what you _can_ do with a lot less expensive toys.
Having Cray's custom interconnects sure is impressive, but for a lot of problems they're not even needed any more. _That_ is what killed Cray.
Most RL problems are not really the kind described as "_one_ huge indivisible data set, that you have to process in _one_ huge batch process." They're more like "we have this process with a small data set that we have to run 100,000,000 times." Most design problems or biology problems are really of that kind: run the same thing 100,000,000 times with different parameters.
And as Seti@Home or Folding@Home proved, a helluva lot of those don't really need _any_ kind of shared memory or fancy interconnects. The real ticket is noticing that instead of accelerating the batch run 200 times, you could just split it into 200 smaller batches run on 200 single-CPU machines.
The super-computer solution costs 2,000,000 just for the machine alone, while the 200 PCs solution costs 200,000 or so. I.e., 10 times cheaper. Better yet, the 200 PCs solution is also far cheaper to program. (Anyone can program a non-threaded batch app.) _And_ for that kind of a problem the 200 PCs solution would actually finish faster, since it has no contention issues whatsoever.
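(To make the "split it into independent batches" point concrete, here's a trivial C sketch of the usual trick: worker k of N takes every N-th case and never talks to anyone else. run_case() is a hypothetical stand-in for whatever simulation you'd actually run, and the rank-via-command-line bit is just for illustration.)

/* Embarrassingly parallel parameter sweep: worker k of N takes every N-th case.
 * No shared memory, no interconnect, no communication between workers. */
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the real per-case computation (a simulation, a gene model, ...). */
static double run_case(long case_id)
{
    return (double)case_id * 0.5;   /* dummy work */
}

int main(int argc, char **argv)
{
    long total_cases = 100000000L;                  /* 100,000,000 runs */
    int  workers     = 200;                         /* 200 cheap boxes  */
    int  my_rank     = (argc > 1) ? atoi(argv[1]) : 0;

    double local_sum = 0.0;
    for (long i = my_rank; i < total_cases; i += workers)
        local_sum += run_case(i);

    /* Each box just writes its own result; someone merges the files later. */
    printf("worker %d done, partial result %g\n", my_rank, local_sum);
    return 0;
}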
Again, that's what really killed Cray and the super-computers. They're technologically impressive, they're a geek's wet dream, but... for 99.9% of the problems out there they're just not worth the price any more.
Re:Just the name brings back memories (Score:2, Funny)
Wrong! There is a third, more used application: Solitaire.
Even super computer coders have to wait for results.
I also asked this recently, but didn't get a reasonable answer: do these beasts have screen savers? If so, are they just the blackout type, or busy 3D-rendered whizbang super cool ones, "just because we can"?
(I realise you may not be able to answer that, but someone might)
Re:Just the name brings back memories (Score:3, Funny)
It creates random noise that is then fed into the Seti project so our computers have something to do in times without activity.
Re:Just the name brings back memories (Score:2)
Saves energy running idle? This thing takes a lot of power to keep running, and wasting it on a screensaver probably isn't very smart. I'd be surprised if X windows were even installed on this thing. This isn't your home computer!
Active / 100% CPU usage vs. Screensaver. (Score:2)
No computer in existence runs a screen saver because the CPU usage is less than 100%. Desktop computers run screen savers to prevent a still image from burning into your monitor or LCD (or to lock the terminal if it thinks you've left). This has everything to do with there being no mouse or keyboard activity, not CPU usage.
When any computer has nothing to do in a timeslice, it generally executes a halt (HLT) instruction, which puts it into a low-power state until a timer interrupt or something comes along and wakes it u
Re:Just the name brings back memories (Score:3, Funny)
There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that.
Well, when you nuke the site from orbit, you do want to be sure don't you?
Re:Just the name brings back memories (Score:5, Insightful)
Curiously, the XT3 IS about shaving dollars off the price. If you go read the original whitepapers on the system, they go through EXTENSIVE cost-return analysis. They studied their (then-)current generation of cluster systems, as well as future Linux/Solaris/AIX clusters, and rejected them as (interestingly) FAR TOO EXPENSIVE once the administrative costs are factored in. They then looked at, and rejected, Cray's vector solution, the X1. They then decided that the (amazingly) most cost-effective solution was to underwrite Cray's product development cycle on a wholly new product. Basically they asked for an update to the system they already had (ASCI Red, i.e. Intel Paragon++). Nobody was building such a thing. Since Cray had a really strong similar product in the 90s (T3D, T3E), the Department of Energy asked them to create an update. Some designs never die.
What I'm most interested in is the reliability. One of the biggest difficulties in the T3D engineering cycle was dealing with memory failure. Red Storm is going to have 10,000 processors. Let's assume each has 2 banks times 3 DIMMs (chip-kill) of memory. That means there are 10,000 x 6 x 18 = 1 million+ memory chips in the system. If 1/100th of a percent of these fail, that's still a lot of memory failures. How well are faults isolated? That's the big question for systems this big.
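(The back-of-the-envelope numbers above, spelled out in a few lines of C; the 0.01% failure rate is the hypothetical from the post, not a real DIMM statistic.)

/* Back-of-the-envelope: memory chips in a 10,000-node Red Storm-class system. */
#include <stdio.h>

int main(void)
{
    long nodes          = 10000;
    long banks_per_node = 2;
    long dimms_per_bank = 3;
    long chips_per_dimm = 18;   /* chip-kill ECC DIMM */

    long chips = nodes * banks_per_node * dimms_per_bank * chips_per_dimm;
    double fail_rate = 0.0001;  /* hypothetical: 1/100th of a percent */

    printf("total memory chips: %ld\n", chips);             /* 1,080,000 */
    printf("expected failures at %.2f%%: %.0f chips\n",
           fail_rate * 100.0, chips * fail_rate);            /* ~108 */
    return 0;
}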
I'm also a little wary of Cray's use of Lustre. I've used Lustre before, as well as other cluster FSes. While I'm not aware of other filesystems that will scale to 700+ I/O nodes, I'm not confident in Lustre. It's an immature product at best. (I don't mean to disparage the people working on it; it's a neat architecture, but it's a hard problem, and I'm not sure it's ready for prime time.)
Re:Nuclear Simulations (Score:2, Insightful)
I admire your positive outlook on the prospects of simulations, but as an experimentalist, I find this "soon we won't need experiments at all" attitude (see Rev. Mod. Phys. 64, 1045-1097 (1992), for instance) very dangerous. Simulations and models, even at the first-principles level, should never be trusted implicitly. The only sure way to tell how nature works is via experimentation.
I can sort of understand simulating nuclear explosions, but simulating the agi
scary thought (Score:2)
make the plutonium pit out of plutonium mixed with plutonium decay products, layered to match the cross-section of a genuine old warhead pit?
make the explosive panels that are supposed to all go off perfectly symmetrically out of aging, unstable, unreliable explosives?
make the wiring exclusively out of decaying cables, which have the insulation falling off?
Hmm. better dismantle and scrap these guys as soon as any one part begins to go.
All the real life te
Opterons and PowerPC together (Score:5, Interesting)
The XT-3's biggest competitor in this segment must be the BlueGene/L-type supercomputer made by IBM. The processors in Blue Gene/L are a custom-built dual-core version of the PowerPC 440 with built-in high-speed interconnects.
Just like IBM has a finger in all the future game consoles, they seem to have a finger in several of the next-generation supercomputers as well. Nice going, IBM.
Re: (Score:3, Informative)
Re:Opterons and PowerPC together (Score:3, Insightful)
It's not that they're the best thing since sliced bread, it's mainly that all their competition went down the chute for one reason or another.
HP/Compaq/DEC was the king of supercomputers. Now they're only supporting their formerly glorious products, with practically nothing new coming to replace them.
Sun seems to really be sitting on their ass.
Int
Re:Opterons and PowerPC together (Score:3, Interesting)
* first, sgi still makes and sells supercomputers, they are far from faded. they also own cray (or did).
* tandem, bought by compaq, we all know what happened there.
* hp sells a superdome once in a while. but nobody seems excited about their itanic systems.
* sun, rotting with their out of date cpus.
* fujitsu is doing well in the supercomputer market.
* nec is also successful.
* ibm, of course.
and you mentioned motorola? you're joking, i hope.
the largest purchasers of superco
Re:Opterons and PowerPC together (Score:2)
Sounds like an opening for...
Google!
Now that I think about it... they have massive experience with huge data systems!
Re:Opterons and PowerPC together (Score:2)
The real difference in this system is the high bandwidth shared memory. Blue Gene has hardware support for shared memory, but the software appears to be strictly MPI based. (at least in the first revision, and according to what I've read, this may be prelimina
Sic transit gloria mundis (Score:2, Insightful)
Re:Sic transit gloria mundis (Score:2, Informative)
That's why they use mere 100-series Opterons: they need only one HT link per CPU, because the machine as a whole is not based on HT interconnects.
Really, loosely-coupled cluster my ass. This machine *is*
Re:Sic transit gloria mundis (Score:2)
I'd link it, but the site is down...
The first test of the new Cray (Score:3, Funny)
Strangely, it took roughly a week. The second test was a simulation of the moderation results of this post.
It received a +5 Funny, which puzzled researchers, as it is currently modded -1 Offtopic.
Damn you Schroedinger!
Leather seats? (Score:2)
AMD gets about... (Score:3)
Also, with a 30,000-CPU complex, AMD must be making a tidy sum.
Re:AMD gets about... (Score:2)
Re:AMD gets about... (Score:2)
My old company managed to get some seriously expensive enterprise software for just 10% of the retail price because we were able to convince them that having us as a client would be a publicity coup for them....
Interesting note (Score:3, Interesting)
This is the first time I've seen Linux shipped with this file system. Now the interesting part is that Lustre is made for Linux clusters, but this monster is not a cluster... can anybody shed some light?
Re:Interesting note (Score:2)
well, sort of.
There are thousands of compute nodes, all of which get I/O services from dozens or hundreds of I/O nodes. These I/O nodes each run their own instance of Linux. Basically the I/O nodes ARE a cluster, though not a compute cluster, and not necessarily a symmetric cluster. The I/O nodes run Lustre in very much the same way that a cluster system would (though they can take advantage of hardware features not present on commodity clusters).
The real difference in this syste
$2 million for a 200 processor system ? (Score:2)
And it seems _really_ low.
I would expect a price at least twice as high.
Ok, $2 million is starting price, but on Cray's website they say the configuration can be as "small" as 96 CPUs.
So it's maybe $2 million for 96 CPUs.
(Still fairly cheap for a Cray, if you ask me)
700kgs, 75dB and 14kW... (Score:4, Funny)
Finally ... (Score:2, Funny)
... Back in my day .... young whippersnapper (Score:5, Interesting)
So come on, ante up. How many remember being awed at the mere sight of old Crays back in the day? Like the Cray-3? I remember the first time I saw a Cray... the thing was in an anti-static environment. To access it, one had to pass through an airlock and be "decharged" or "depolarized," etc. Basically they somehow charged the air to get rid of static electricity. Then you had this system that was running *in* liquid! Take that, "Oh I'm so cool 'cause I have a l337 haX0r water-cooled CPU" overclockers.
They (Cray) were so proud of this accomplishment that the upper portion of the cabinet was some kind of plexiglass so you could see the fluid as it moved, and moved wiring and what not with it. Very surreal feeling, almost like the thing was breathing.
And what about the Cray-1? Wasn't that a true testament to 70's *art* and sculpture? The thing looks like some kind of freaky bus station bench with its odd red and white panels and black base. Though I don't know if they all looked like that; maybe you could get them in other colors?
Ahh .... those were the days.
are you sure you remember seeing the Cray 3 ? (Score:3, Informative)
Re:are you sure you remember seeing the Cray 3 ? (Score:2)
Well, anything is possible. It may not have been the 3 I saw. Though I seem to remember being able to see over it, and I think almost every other Cray was at least 6 foot tall. But I don't know, it could have been recessed into the floor. I was quite a bit younger at the time, obviously. I was more focused on the tour guide guy showing off the fact that electronics were in liquid. At the time I thought if it was wet, it conducted electricity, so that kind of blew my mind.
I'd love to find an old Cray, li
Was it the 1a or 2 maybe? (Score:2)
2 [army.mil]
Re:are you sure you remember seeing the Cray 3 ? (Score:2)
Re:are you sure you remember seeing the Cray 3 ? (Score:3, Interesting)
I looked around on the net, as did a couple other /.'ers here; someone posted a link to a 2, and I found a pic of a 2 with the waterfall system that another person mentioned, and I must accept defeat within the loosened strands of my unraveling mind.
It was indeed a Cray-2 that I remember so vividly. Nevertheless, still an extremely exotic machine. Very much the Ferrari F40 or McLaren F1 of super computing. You've seen pics, maybe even seen one at a car show, but you know you'll never be
Of interest to Cray-3 info (Score:3, Interesting)
Cray-3 memories by Steve Gombosi From a comp.unix.cray posting
Graywolf ("S5") was installed at NCAR. Like all NCAR supercomputers, until fairly recently, it was named after a Colorado locale.
This was the *only* Cray-3 shipment. Installed in May 1993, the machine was a 4-processor, 128-megaword system.
Two problems in the Cray-3 system were uncovered as a result of running NCAR's production climate codes (particularly MM5): a problem with the "D" module causing intermittent problems with parallel co
Re:... Back in my day .... young whippersnapper (Score:2)
Compared to ASCI Red, the system that Red Storm is replacing, the XT3 looks incredible. Yes, it's a long row of rectangular racks, but at least they are stylish racks. Intel built ASCI Red in beige-box style. Oh well, function over form I suppose.
Re:... Back in my day .... young whippersnapper (Score:2, Informative)
Before that was the Cray-2 (a.k.a. "World's most expensive aquarium"). In case anybody's interested, I believe they used Fluorinert as the liquid, as it wouldn't swell the PC boards, short anything out, or cause anything to corrode.
A note: the Cray-3 was created by Cray Computer Corporation of Colorado, whereas the Cray-1 was made by Cray Research of Wisconsin. In ~1990, Seymour wanted to start working on computers using gallium arsenide instead of s
I saw a demo of one... (Score:2)
You don't have to begin to imagine (Score:4, Informative)
Re:You don't have to begin to imagine (Score:5, Interesting)
More interesting is this spec:
Acoustical Noise Level: 75 dBA at 3.3 ft (1.0 m)
For comparison, that's roughly the same as an average vacuum cleaner when you're operating it, or maybe a good-sized pickup truck passing you in the next lane.
And remember, this value is *per cabinet*. You have to add the levels logarithmically over all the cabinets in an installation to get a true dB level. I wonder whether the maintenance people will have to follow noise-exposure limits for this baby.
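(For the curious: combining incoherent sources is L_total = 10*log10(sum of 10^(L_i/10)), which for N identical sources reduces to L + 10*log10(N). A small C sketch, assuming identical cabinets at 75 dBA and ignoring distance and room acoustics entirely.)

/* Combine N equal, incoherent noise sources: L_total = L + 10*log10(N).
 * Compile with: cc dbsum.c -lm */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double per_cabinet_dba = 75.0;
    int    cabinet_counts[] = { 1, 2, 10, 100 };

    for (int i = 0; i < 4; i++) {
        int n = cabinet_counts[i];
        double total = per_cabinet_dba + 10.0 * log10((double)n);
        printf("%3d cabinets -> about %.1f dBA\n", n, total);
    }
    return 0;
}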
And here I was, complaining about the quiet whine of my PC's fan.
Re:You don't have to begin to imagine (Score:2)
Re:You don't have to begin to imagine (Score:5, Interesting)
that's amazing. how did the cray guys get a kilovolt-ampere that is not equal to a kilowatt? just goes to show you the power of fast interconnects.
Re:You don't have to begin to imagine (Score:5, Informative)
It's fairly common to get kVA != kW.
Overall power used by a load is expressed as S = P + jQ, where P is the "real" power and Q is the reactive power (capacitive/inductive from motors, fluorescent lamp ballasts, etc.).
While the "units" of S, P, and Q are all power = voltage * current, S is generally expressed in VA, P in W, and Q in VAR (volt-amperes reactive) to differentiate the variables. Because the magnitude |S| = sqrt(P^2 + Q^2), S will always be greater than or equal to P (in this case, 14.8 kVA = sqrt((14.5 kW)^2 + (±2.965 kVAR)^2)).
Re:You don't have to begin to imagine (Score:2)
Re:So......the cost compared to? (Score:2, Informative)
Re:So......the cost compared to? (Score:5, Informative)
There's not a lot to compare. We're talking apples and oranges. It's like comparing a PowerMac G5 with a bunch of PC parts scattered on the floor as desktop machines. Sure, you can put the PC together, load it with Linux, tinker with it to get everything working, etc., but that's a fair amount of work compared to taking the PowerMac out of the box, plugging it in, turning it on, and having everything work perfectly.
Read the specs [cray.com], particularly with regard to the interconnect, system administration, and hardware and software reliability features. This thing is seriously engineered to be a massively parallel system, with top-of-the-line hardware and software to support and maintain that, as well as extremely impressive reliability features.
Jedidiah.
Re:So......the cost compared to? (Score:3, Informative)
The VT Supercomputer specs vs the Cray specs page you pointed to:
CRAY 460 GFLOPS per cabinet (96 processors @ 2.4 GHz)
Apple - if my math is right - 420 GFLOPS (100 processors @ 2.0 GHz)
The new specs for the specialized VT Supercluster are pretty impressive.
Their throughput and interconnect are most likely weaker - but still VERY strong with InfiniBand.
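(The peak numbers above fall out of a one-line formula: peak GFLOPS ~ processors x clock in GHz x flops per clock. The 2 flops/clock used below is an assumption that roughly reproduces both figures, not a number from either vendor's spec sheet.)

/* Rough peak-FLOPS estimate: processors * clock (GHz) * flops per clock.
 * The flops-per-clock value here is an assumption for illustration only. */
#include <stdio.h>

static double peak_gflops(int procs, double ghz, int flops_per_clock)
{
    return procs * ghz * flops_per_clock;
}

int main(void)
{
    /* Cray XT3 cabinet: 96 Opterons at 2.4 GHz, assuming 2 flops/clock. */
    printf("XT3 cabinet  : ~%.0f GFLOPS peak\n", peak_gflops(96, 2.4, 2));

    /* 100 G5s at 2.0 GHz with the same assumed 2 flops/clock. */
    printf("100 x G5 2.0 : ~%.0f GFLOPS peak\n", peak_gflops(100, 2.0, 2));
    return 0;
}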
Re:So......the cost compared to? (Score:2)
Oh, why is that?
The VT Supercomputer specs vs the Cray specs page you pointed to...
Right, so if you compare Virginia Tech forking over money not just for Xserves, but also for all the interconnects, the cooling, setting up the system, developing maintenance, monitoring and administration software etc. to buying a Cray and plugging it in they have almost comparable performance?
If I buy a cheap AMD CPU, cheap motherboard etc. overclock it,
Re:So......the cost compared to? (Score:3, Funny)
No, we're talking Apples and Crays... Didn't you read the post before replying?
yum!! (Score:2)
Operating System:
UNICOS/lc--Components include SUSE(TM) LINUX(TM), Cray Catamount Microkernel, CRMS and SMW software
I remember installing SuSE linux long ago on a ppc. The first thing I did once I got X running was fire up the gimp and doctor their logo so it said "Welcome to DuDE Linux...."
To answer your question, it looks like they've patched SuSE to run on the Cray Catamount Microkernel. Since there's no way in hell I'm going to buy one of these for my modest word processing and
Re:software (Score:5, Informative)
UNICOS is usually a safe bet. In this case the specs [cray.com] say UNICOS/lc, which is made up of "SUSE(TM) Linux(TM), Cray Catamount Microkernel, CRMS and SMW software"
I'm not entirely clear on how to interpret that, but I think it runs as follows: it runs the Catamount Microkernel as the kernel, and uses SUSE for everything else (so we have SUSE Linux without the Linux - all of a sudden that GNU/Linux stuff starts to make sense). The CRMS is their interconnect management and monitoring software, and SMW is the System Management Workstation - which I'm guessing is their administration frontend.
It's worth noting that that's some pretty serious software there (because Cray has a lot of experience dealing with large systems) - you can bet that the management and monitoring software is some very serious stuff.
This thing is to a beowulf cluster what a dual G5 PowerMac is to homebuilt PC system running Linux From Scratch. It's going to work flawlessly "out of the box" with a smooth and polished interface that lets you get done everything you want to do simply and easily. You can of course make your home built PC with LFS work just as well, it's just going to take you an awful lot of effort.
Jedidiah.
Re:software (Score:2)
hybrid system with multiple kernels (Score:3, Informative)
Re:software (Score:4, Informative)
catamount is the kernel that runs on the compute nodes. It's a tiny kernel that packages up the OS service requests and sends them, over the interconnect, to an OS or I/O node, which does the real work of the operating system. catamount is a descendant of PUMA, which came from Cougar. These are heavily derived from work done at caltech. (I believe CMU and one of the UTexas schools also played a role, but I'm not sure.) The idea is that the microkernel is small and unobtrusive, and it gets the hell out of the way so the application can use the CPU as much as possible.
The OS and I/O nodes run linux, and provide services to the compute nodes. This is probably done in kernel space, but it could just as easily be running as a user-space daemon on the OS node. (Though you might have to do some mem-copies that way, which would lower performance.)
NOTE: Though these nodes take advantage of some of linux's features (like the lustre file system) they do NOT necessarily implement these features for the system as a whole. They probably provide a minimal set of features necessary for the sorts of problems that the xt3 runs. All the scheduling work that has gone into more recent linux kernels is of little use, as the compute nodes have their own scheduler, probably more closely tied to the batch dispatcher than to the linux kernel. To say that the system runs linux is true, but a little misleading. It's a very different linux than what runs on my desktop, and it's used in a very different way.
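(Purely as an illustration of the "package up the request and ship it to a service node" idea - this is not Cray's actual interface or code, just a sketch of the pattern in plain C, with made-up struct fields.)

/* Illustration only: a lightweight kernel forwarding an I/O request to a
 * service node instead of handling it locally. Not Cray's actual interface. */
#include <stdio.h>
#include <string.h>

struct io_request {
    int  opcode;            /* e.g. 1 = write */
    int  fd;
    char payload[64];
    int  len;
};

/* Stand-in for "send this message over the interconnect to an I/O node". */
static void forward_to_io_node(const struct io_request *req)
{
    printf("[compute node] forwarding op %d, fd %d, %d bytes to I/O node\n",
           req->opcode, req->fd, req->len);
}

/* What a trivial write() path might look like under this model: the
 * microkernel does no file-system work itself, it only forwards. */
static int lightweight_write(int fd, const char *buf, int len)
{
    struct io_request req = { .opcode = 1, .fd = fd, .len = len };
    memcpy(req.payload, buf, (size_t)(len < 64 ? len : 64));
    forward_to_io_node(&req);
    return len;
}

int main(void)
{
    lightweight_write(1, "hello from the compute node\n", 28);
    return 0;
}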
Re:MP performance overhead (Score:4, Informative)
You can't really compare something that can hold thousands of CPUs to something powered by Abit that can hold two, anyway. It's like comparing apples and a strange bug thing with tentacles.
Re:Not even trying... (Score:2)
Re:cray (Score:5, Interesting)
Although it's true that Cray was not growing strongly before the SGI buy-out, it was not failing either. It could have kept running quite happily for many years, but in the bizarro-world of Wall Street, a company which is not growing is dying. I so love it when economists use biological terminology for corporations. In Wall Street's thinking, the only healthy growth would be a cancerous tumor.
Anyway....
The whole SGI period of Cray is actually quite fascinating, and I suspect the true story will never be fully known. Lots of SGI engineers had their non-Cray technology branded with Cray marketing names, most egregiously LegoNet becoming CrayLink. Lots of Cray folks - a.k.a. Crayons - felt that the core of their company was gutted by an SGI operation which didn't care for the extreme high end of HPC.
One rumor I heard, from a well-placed source, is that the Cray merger with SGI was primarily arranged by the USG. The intelligence services have huge investments in both companies' products, so the merger between them made sense. I was told that as a quid pro quo, the USG had an in-principle agreement to continue purchasing Cray gear, to provide enough revenue inside SGI to keep both Cray architectures alive. However, certain parts of SGI felt that the US government didn't live up to their agreement, negotiations to rectify that weren't successful, and so SGI management defunded significant aspects of the Cray engineering work.
Also, FYI, Cray is one of those companies which will never totally go "belly up" anyway. Given the sensitivity of the work which they did, their support databases alone are full of sensitive and/or classified information. Should the company cease trading, it would be acquired by a shelf company whose sole function is to ensure this data would remain private. That's been the fate of almost all of the now-defunct supercomputer and high-end graphics companies who formerly supplied the defence and intelligence market.
Wall Street (Score:3, Informative)
Surrealistic case in point: at one point 3Com had a lower market value than its Palm daughter company. Basically, if you subtract the value of the Palm shares, the whole rest of 3Com was actually worth a _negative_ value to the stock market.
And we're talking divisions which were making a tidy profit. Yet they were apparently worth a _negative_ number.
No, it's not a joke. Rol
Re:cray (Score:3, Funny)
What good is a supercomputer without blinkenlights ? They just don't make them like they used to...
Re:newfangled buzz. (Score:3, Interesting)
Re:fucking death labs (Score:2)
Re:The math for a comparable Xserve system (Score:5, Insightful)
What a value!!
That is, until you throw a tightly coupled problem at it and the Cray is 10 times faster because it has much better internode bandwidth and lower latency.
And you forgot to count the cost of the InfiniBand interconnect that the VT cluster used. That's a couple grand per node.
Bottom line: apples and oranges. If your application is easily parallelizable (i.e. doesn't require much communication between the nodes), you'd be stupid to piss away your money on a "real" supercomputer instead of a cluster. And vice versa.
Re:The math for a comparable Xserve system (Score:2)
If your application is not parallelizable, the supercomputer pisses away on you ?
Re:The math for a comparable Xserve system (Score:2)
If your application is not parallelizable, the supercomputer pisses away on you ?
Only in Soviet Russia.
Re:The math for a comparable Xserve system (Score:3, Informative)
#1 RAM: $3000 for the G5 cluster node includes 512 MB of RAM. Most places demand at least 2 GB of RAM per CPU; we require 3 GB of RAM per CPU in all new system purchases. This brings the node price (from apple.com) to $6500.
200x $6500 = $1,300,000
#2 Racks and power: Each rack can hold about 32 machines (without getting way too hot/dense); for 200 nodes, this would be about 7 racks.
7x $1200 = $8400
#3 Interconnect: No HPC system is useful without an interc
Re:Internode bandwidth - cheap solutions? (Score:2)
'Supercomputing on a budget' is kind of an oxymoron. You either pony up, or you're not supercomputing; you're parallel processing on a big cluster.