Cray Wins $52 Million Supercomputer Contract

The Interfacer writes "Cray and the U.S. Department of Energy (DOE) Office of Science announced that Cray has won the contract to install a next-generation supercomputer at the DOE's National Energy Research Scientific Computing Center (NERSC). The systems and multi-year services contract, valued at over $52 million, includes delivery of a Cray massively parallel processor supercomputer, code-named 'Hood.'"
This discussion has been archived. No new comments can be posted.

  • Apparently not. (Score:4, Informative)

    by Loconut1389 ( 455297 ) on Thursday August 10, 2006 @10:33PM (#15886288)
    http://news.com.com/2100-1001-237517.html [com.com]

    Even I didn't notice that happen. Apparently Tera bought Cray from SGI and changed the name back for recognition purposes.
  • by jpellino ( 202698 ) on Thursday August 10, 2006 @10:52PM (#15886408)
    H. P. Hood is a beloved, ages-old dairy company that started outside Boston.
    They had giant milk-bottle ice cream stands; one stood outside the old Computer Museum on Congress St.
    No slight intended concerning ethnic neighborhoods.
  • by convolvatron ( 176505 ) on Thursday August 10, 2006 @11:04PM (#15886467)
    It's pretty much a cluster. The SeaStar is a message-passing engine. It's distributed memory, and the OS doesn't share any state (except for a library that does filesystem indirection).
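    For anyone unfamiliar with the jargon, here is a minimal sketch of what "distributed memory plus message passing" means in practice, using mpi4py. This is generic illustration only -- nothing in it is Cray- or SeaStar-specific, and the payload is made up.

      from mpi4py import MPI   # assumes mpi4py is installed; run with: mpirun -n 2 python demo.py

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      # Each rank owns its memory privately; data moves only via explicit messages.
      if rank == 0:
          payload = {"step": 1, "values": [1.0, 2.0, 3.0]}
          comm.send(payload, dest=1, tag=11)    # explicit send, no shared state
      elif rank == 1:
          payload = comm.recv(source=0, tag=11)
          print("rank 1 received", payload, "from rank 0")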
  • by Anonymous Coward on Thursday August 10, 2006 @11:46PM (#15886669)
    Unfortunately, this seems to be one of the topics where Slashdot bias and ignorance come out in full force.

    * Clusters can not compete with supercomputers. They aren't even in the same market space. Cray doesn't make clusters, and clusters have not taken away their business.

    * Cray doesn't take off-the-shelf hardware and sell it as fancy clusters. Actually look into the details of these machines. While the processors are sometimes off the shelf, much of the surrounding hardware and software is custom.

    * This $50 million contract is one of many that Cray has. They also recently landed a $200 million contract, and they're a contender in the DARPA HPCS program, which could be worth a lot more if they win it. They aren't dying.

    * They aren't owned by SGI any longer. They were bought from SGI by Tera, who renamed themselves Cray.

    * The Top500 list is nonsense. It is based on one benchmark (Linpack). That benchmark doesn't stress the interconnect much, and it can allow clusters to appear to compete with supercomputers if you manage to ignore all the other factors. The number of teraflops has very little to do with performance. For a more well-rounded and thought-out measurement of top systems, check out HPCC's website: http://icl.cs.utk.edu/hpcc/hpcc_results.cgi [utk.edu]

    * Blue Gene doesn't kick Cray's ass. See the above, and then see how it really performs overall. In some areas it does better, and in others it just gets destroyed. Depending on the real-world problem, a full-size Blue Gene may not even be able to perform as well as a much smaller Cray.

    If you don't know what you're talking about, look it up before posting. Just because something is the common belief doesn't mean there's any truth to it!
  • by Anonymous Coward on Friday August 11, 2006 @12:04AM (#15886750)
    I work in the industry. 'Course it's easy for an AC to say that, isn't it?

    $52M is rather large nowadays. At least, for a 'commodity'-part cluster it is. For a 'vector' supercomputer, it may be only medium-sized.

    You can easily break the top 50 for less than $10M. A couple thousand nodes, each with two dual-core Opterons or Xeons, InfiniBand or Myrinet (maybe 10GigE), and a compiler that optimizes better than gcc... no problem. (Rough peak-FLOPS arithmetic at the end of this comment.)

    That being said, NERSC is a pathologically tough customer. Cray will have to work very hard to earn each and every penny they get. It may very well be a 'live or die' deal for Cray.
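    To put a rough number on that, here is a back-of-envelope peak-FLOPS sketch in Python. Every figure in it is an assumption for illustration, not the spec of any actual bid.

      # Hypothetical commodity cluster, loosely matching the description above.
      nodes = 2000              # "a couple thousand nodes" (assumed)
      sockets_per_node = 2      # two dual-core Opterons or Xeons per node
      cores_per_socket = 2
      clock_ghz = 2.4           # assumed clock speed
      flops_per_cycle = 2       # assumed double-precision FLOPs per core per cycle

      peak_gflops = nodes * sockets_per_node * cores_per_socket * clock_ghz * flops_per_cycle
      print(f"theoretical peak: {peak_gflops / 1000:.1f} TFLOPS")   # ~38.4 TFLOPS

    Sustained Linpack numbers always come in well below theoretical peak, but the exercise shows why a sub-$10M commodity machine could plausibly break the top 50 of that era, as claimed above.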
  • by bobcat7677 ( 561727 ) on Friday August 11, 2006 @01:15AM (#15887068) Homepage
    I dunno man. First of all, asking people to mod you up is kinda lame.

    Secondly, whether the computers Cray sells are "off the shelf" can be argued, depending on how you look at it. Today's Crays are not the fully proprietary machines of yesteryear. They all use AMD Opteron processors and leverage the onboard memory controller and HyperTransport bus to make a simple processor fabric. The main custom items in the system are the "interconnect routers" that tie all the HyperTransport buses together. Even the FPGA components that facilitate handling specific custom tasks in hardware are somewhat "off the shelf" and just woven into the greater HyperTransport happiness fabric.

    Sure, the average person is not going to be able to build a "supercomputer" like this with stuff they bought off the Fry's shelf. But are we talking about "off the shelf" as in the average electronics store? Or "off the shelf" as in parts that are pre-existing, available on some shelf somewhere, and have published documentation?

    Benchmarks of any multiuse system are never universal. The best they can do for a large list like that is to use a benchmark that can reasonably represent a common use of such systems. Cray has been good about having systems that can be configured to perform exceptionally well for very specific applications. Modern offerings like the XD1 are no different in that respect, as they offer that in the FPGAs. To say they are not in the same market space as clusters is like saying MySQL isn't in the same market space as PostgreSQL. They both have their strong points, but there are many instances where a user has to decide which to go with.

    I'm going to stop there...time for sleep.
  • by mjsottile77 ( 867906 ) on Friday August 11, 2006 @02:33AM (#15887308)
    "Benchmarks of any multiuse system are never universal. They best they can do for a large list like that is to use a benchmark that can reasonably represent a common use of such systems."

    The Linpack benchmark used for the Top500 list is a basic dense matvec solver algorithm. (See Wikipedia: http://en.wikipedia.org/wiki/LINPACK [wikipedia.org].) This algorithm used to be the core of most scientific codes, back in the days when you would simply use the computer to solve a simple (but large) set of equations. In the last decade or two, the scientific world has moved to unstructured problems where the solvers are no longer solely matvec operations. Adaptive mesh methods, multigrid, and other similar "modern" methods in scientific computing do NOT have the same behavior as a basic dense matvec -- a simple example is a matvec problem where one deals with sparse matrices. Life gets even worse if you try to use Linpack to reason about how a machine would perform on something highly data-dependent, such as an n-body code or a molecular dynamics simulation.

    Linpack is really an archaic relic, and it is NOT a benchmark of a multiuse system. It is a benchmark of a supercomputer from 15+ years ago. This is not news in the parallel computing world -- many efforts such as ParkBench, the NAS Parallel Benchmarks, the Livermore loops, etc. have been proposed as replacements for Linpack to better cover the sorts of applications a real "multiuse" system will run. Unfortunately, the fact that most procurement folks and politicians who help fund these big govt. machines do not understand that Linpack is a total waste of time has caused it to persist, contrary to the desires of the people who either use the systems or spend their careers studying performance issues in big parallel systems.
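    To make that gap concrete, here is a rough NumPy/SciPy sketch contrasting a Linpack-style dense solve with the sparse matrix-vector products that dominate unstructured codes. The size and density are arbitrary choices for illustration, not a calibrated benchmark.

      import time
      import numpy as np
      import scipy.sparse as sp

      n = 2000                                         # arbitrary problem size

      # Linpack-style kernel: factor/solve a dense system (O(n^3) work, high arithmetic intensity).
      A_dense = np.random.rand(n, n) + n * np.eye(n)   # diagonally dominant, so the solve is well behaved
      b = np.random.rand(n)
      t0 = time.time()
      x = np.linalg.solve(A_dense, b)
      print(f"dense solve: {time.time() - t0:.3f} s")

      # Unstructured-style kernel: repeated sparse matvecs (little arithmetic per byte moved,
      # memory- and latency-bound -- behaviour Linpack barely exercises).
      A_sparse = sp.random(n, n, density=0.001, format="csr") + sp.eye(n, format="csr")
      y = b
      t0 = time.time()
      for _ in range(500):
          y = A_sparse @ y
          y /= np.linalg.norm(y)                       # keep the vector bounded (power-iteration style)
      print(f"500 sparse matvecs: {time.time() - t0:.3f} s")

    The timings themselves are beside the point; what matters is that the two kernels stress completely different parts of the machine, which is why ranking systems on the first one alone says little about the second.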
  • Re:Who else bid? (Score:3, Informative)

    by cannonfodda ( 557893 ) on Friday August 11, 2006 @04:21AM (#15887589)
    I would imagine that IBM probably did bid. They would be crazy not to for $52M.

    But....... "the Hood system installed at NERSC will be among the world's fastest general-purpose systems".

    NERSC are looking for general-purpose computing systems to fill the needs of 2,500 users. Blue Gene is blindingly fast at some things, but general purpose it ain't. I've benchmarked both the XT3 and Blue Gene with a set of general scientific codes, and the Opteron delivers much better general price/performance for a representative set of tasks. Blue Gene will fly if you have the time to get REALLY low-level in your optimisation, but most scientists don't have the time or knowledge to start dealing with that kind of thing.
  • Re:Just anounced (Score:1, Informative)

    by Anonymous Coward on Friday August 11, 2006 @07:08AM (#15887964)
    Guess somebody's a little mad about having to leave their H2 parked.

    The part of the DOE that uses supercomputers does nuclear simulations. They don't give a crap about your unwise car choice.
  • by moosesocks ( 264553 ) on Friday August 11, 2006 @08:34AM (#15888225) Homepage
    I'd say go ahead and mod him up.

    He's right. For *ALL* computing tasks, using the right tool for the job can increase performance dramatically. Slashdotters should know this -- a 400 MHz GPU can outperform a 3 GHz CPU on vector and matrix operations by huge leaps and bounds.

    Clusters are just another tool that works very well for very specific jobs, and very poorly for others. These jobs are mainly those that can be massively parallelized (i.e. brute-forcing a math equation -- Computer A should try these values, Computer B should try those values, etc.; a toy sketch of that kind of split follows at the end of this comment). Anything more complex than that puts a huge strain on the system used to interconnect the machines. Once you start incorporating a fast interconnect, the cluster begins to resemble an extremely inefficient supercomputer with multiple points of failure. At that point, it makes more sense to just use a Cray.

    Over the past few years, for the first time, it's been possible to use the same chips in supercomputers as in desktops -- specifically the Opteron and the PPC970. As a result, consumers got more powerful chips, and supercomputers got a lot cheaper due to economies of scale. As an added bonus, now that the R&D is combined into one architecture, we're getting faster chips on a more regular basis.

    AMD did a lot of things right with the Opteron. They made a series of consumer chips that were inexpensive and blazing fast. They then took the same architecture and made enterprise-grade chips that were rock solid, equally fast, energy-efficient, and still pretty cheap. HyperTransport is also an incredible technology, in that it's suitable for inexpensive machines and supercomputers alike. Itanium was none of these things.

    I, for one, am glad to see supercomputing coming back into fashion. The DOE's working on a lot of good science that will be essential for our survival in the long run, and the government seems to be providing them ample funding. Sure, NASA may do some cool science, but it's the DOE that's working on more meaningful things that can be put to use here on Earth for the betterment of mankind. Perhaps the only positive thing to come out of the current political mess is that the world is quickly realizing how desperately we need to move away from an oil-based society.
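    To make the "Computer A tries these values, Computer B tries those" idea concrete, here is a toy single-machine sketch using Python's multiprocessing. On a real cluster each chunk would go to a separate node via MPI or a batch scheduler; the search range and the predicate are invented purely for illustration.

      from multiprocessing import Pool

      def search_chunk(bounds):
          """Brute-force one slice of the search space; no inter-chunk communication needed."""
          lo, hi = bounds
          # Hypothetical predicate: cubes ending in ...888 (illustrative only).
          return [n for n in range(lo, hi) if (n ** 3) % 1000 == 888]

      if __name__ == "__main__":
          total, workers = 1_000_000, 8
          step = total // workers
          chunks = [(i * step, (i + 1) * step) for i in range(workers)]   # "Computer A", "Computer B", ...
          with Pool(workers) as pool:
              hits = [n for part in pool.map(search_chunk, chunks) for n in part]
          print(f"found {len(hits)} hits")

    The interconnect never enters the picture here, which is exactly why plain clusters shine on this kind of job and struggle on tightly coupled ones.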
  • Re:Just anounced (Score:3, Informative)

    by crgrace ( 220738 ) on Friday August 11, 2006 @09:12AM (#15888423)
    The DOE runs our system of national laboratories, and is the successor to the Atomic Energy Commission. They aren't all that concerned with gasoline, as that is a small part of their work. They mostly work on nuclear weapons, fusion research, high-energy physics, renewable resources, etc. I used to work at Lawrence Berkeley National Lab designing subatomic particle detectors. I couldn't give a rat's ass about how much you spend for gas.

  • Unfortunately, this seems to be one of the topics where Slashdot bias and ignorance come out in full force.

    I agree completely.

    * Clusters can not compete with supercomputers. They aren't even in the same market space. Cray doesn't make clusters, and clusters have not taken away their business.

    This is not exactly a wrong statement, but it is incredibly broad. First off, Cray does make clusters. At a fundamental level, the basic separate-box clusters connected by Ethernet are the exact same thing as a big massively parallel system. They are on different ends of the spectrum, certainly. The sort of interconnects used by Cray certainly make their systems much more suited to certain workloads than more basic clusters. In practice, even Single System Image vs. separate boxes isn't that big a distinction. And, basic clusters certainly do compete with and take business from Cray. If basic clusters weren't an effective means of computing, then there would be a much larger market for the supers. If I refer to "clusters" in this post, I am probably referring to separate-box basic clusters -- like the parent poster seems to be. As unclear as this terminology can be, it is the way the term is usually used.

    * Cray doesn't take off-the-shelf hardware and sell it as fancy clusters. Actually look into the details of these machines. While the processors are sometimes off the shelf, much of the surrounding hardware and software is custom.

    This point I fully agree with. The high-end interconnects and whatnot that you see in supers are on a very different level from what you see in the more basic clusters. For the workloads where the supers kill the basic clusters, it's usually down to comms latency between the nodes, which is all about the crazy interconnects (see the latency sketch at the end of this post).

    * This $50 million contract is one of many that Cray has. They also recently landed a $200 million contract, and they're a contender in the DARPA HPCS program, which could be worth a lot more if they win it. They aren't dying.

    I'll take your word for it. I haven't specifically kept up with Cray's contracts, though it wouldn't surprise me if they are doing pretty well.

    * They aren't owned by SGI any longer. They were bought from SGI by Tera, who renamed themselves Cray.

    Yup, no argument there. (See, I may be a jerk, but at least I'm not arguing with everything! ;) )

    * The Top500 list is nonsense. It is based on one benchmark (Linpack). That benchmark doesn't stress the interconnect much, and it can allow clusters to appear to compete with supercomputers if you manage to ignore all the other factors. The number of teraflops has very little to do with performance. For a more well-rounded and thought-out measurement of top systems, check out HPCC's website: http://icl.cs.utk.edu/hpcc/hpcc_results.cgi [utk.edu]

    I wouldn't go so far as to call top500 "nonsense." It is a very specific benchmark. People do tend to look at a very narrow, specific piece of information, and generalise it completely. *That* is nonsense. You have to be aware of what you are reading when you see stuff like benchmark numbers. Benchmarking can be very complex.

    That said, there are some real-world workloads that work quite a lot like Linpack. Consequently, there are a lot of very real-world tasks where a cluster is an appropriate tool. My personal interest in HPC tends to focus on 3D rendering performance. This tends to need a lot of FLOPS and relatively little bandwidth. For the guys who are doing really bandwidth/latency-intensive stuff, the basic clusters are useless. (I'm told that stuff like weather sim falls into this category, but I can't comment on the details.) Without specifying a workload, saying that one class of machine simply beats the other doesn't mean much.
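    To see why the latency point above matters so much, here is a rough mpi4py ping-pong sketch of the kind of microbenchmark people use to compare interconnects. The message size and repetition count are arbitrary; run on commodity GigE versus a purpose-built HPC interconnect, the reported latencies differ enormously.

      from mpi4py import MPI        # run with: mpirun -n 2 python pingpong.py
      import numpy as np

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      reps = 1000
      buf = np.zeros(8, dtype=np.uint8)      # tiny message: latency dominates, not bandwidth

      comm.Barrier()
      t0 = MPI.Wtime()
      for _ in range(reps):
          if rank == 0:
              comm.Send(buf, dest=1, tag=0)  # capital-S Send/Recv use the fast buffer-based path
              comm.Recv(buf, source=1, tag=0)
          elif rank == 1:
              comm.Recv(buf, source=0, tag=0)
              comm.Send(buf, dest=0, tag=0)
      elapsed = MPI.Wtime() - t0

      if rank == 0:
          # Each rep is a round trip (two messages), so halve it for the one-way figure.
          print(f"approx. one-way latency: {elapsed / reps / 2 * 1e6:.2f} microseconds")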
