U.S. Helps Finance New Cray Development
Durinia writes "SGI has announced a few details on their next Cray vector supercomputer. The press release is mostly about them getting government support for the R&D. It does, however, mention that it will be combining the powerful Cray vector processors with SGI's ccNUMA architecture for big-time scalability."
Re:Is this the same Cray SGI just Dumped???? (Score:1)
You know not of what you speak... (Score:3)
A supercomputer doesn't have what you'd consider an operating system. It's a front-end computer that does all I/O, provides the usual operating system services, and controls the supercomputer. Linux is perfectly practical for the front-end. It would be nice to see Linux in there.
Bruce, what are you talking about? I suspect you haven't seen a Cray machine for quite some time.
Most of the current Cray vector machines (like our T90) have their own OS (UNICOS) and I/O subsystem. The I/O is in a physically separate box, like in a classical mainframe, but it's still part of the machine. There are generally a couple of workstations (usually Suns) attached directly to the machine, but those are system consoles and monitoring stations. A Linux box might be appropriate for one of these monitoring stations, but that's about it. And if you think a Linux machine could handle the I/O that a Cray's capable of, you're insane. We're talking multiple GB/s.
The idea for the SV2 (as it was explained to us at a Cray User Group workshop last fall) is that it piggybacks on a MIPS-based SN1 (the next-generation SGI Origin ccNUMA machine). That implies IRIX (with features ported from UNICOS), not Linux. I doubt Linux on Intel or MIPS will be ready for the kind of prime time SGI's going to be selling the SV2 for by the time they ship.
(Before you flame me for putting down Linux in this particular context, consider the following: I've been using Linux as my primary OS at home since '93, and I'm one of the guys working on the Beowulf cluster of SGI 1400Ls at OSC. I'm rooting for Linux too, but it's not always the right answer.)
Re:So what length keys do you have then? (Score:2)
This is only a conjecture for now.
laurent
---
Re:You know not of what you speak... (Score:2)
Re:NSA... (Score:1)
Re:SGI Cray?? oh please, linux has it beat right n (Score:1)
Steve Ruyle
Alaska Supercomputer is a Cray (Score:1)
You can read more about the current hardware at http://www.arsc.edu/resources/Hardware.html [arsc.edu]
Re:You want it, you pay for it ... (Score:1)
I sat through a Beowulf-type cluster presentation just yesterday, and the upshot is that they simply cannot touch some problems. If there is significant communication going on between nodes, especially if it comes in bursts, present-day Beowulf does not cut it.
We are seeing 95%+ efficiency on 64 or even 128 CPUs on a Cray T3E, while an Intel cluster connected through switched 100BaseT shows something like 75% at only 16 nodes. Going to Gigabit Ethernet, this figure does not improve much, mostly due to inefficient Gigabit drivers. One is then forced to look into much faster, but much costlier, interconnects.
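To make the comparison concrete, here's a minimal sketch of how parallel efficiency is usually computed (speedup divided by CPU count). The runtimes below are made up to reproduce roughly the percentages quoted above, not measured numbers:

```python
# Parallel efficiency: fraction of the ideal N-fold speedup actually achieved.
# Runtimes here are hypothetical, chosen only to illustrate the figures above.

def efficiency(t_serial, t_parallel, n_cpus):
    """Speedup (t_serial / t_parallel) divided by CPU count."""
    return (t_serial / t_parallel) / n_cpus

# A job taking 1000 s serially, 8.2 s on 128 T3E CPUs: ~95% efficient
print(round(efficiency(1000.0, 8.2, 128), 2))

# The same job taking 83 s on 16 cluster nodes over 100BaseT: ~75% efficient
print(round(efficiency(1000.0, 83.0, 16), 2))
```

The gap is communication overhead: every message over commodity Ethernet costs orders of magnitude more latency than a T3E's interconnect, and bursty communication makes it worse.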
Then there are also the CPU-to-memory bandwidth and the latency of disk I/O to consider.
Simply stated, Beowulf-type clusters today cannot touch 'real' supercomputers for many types of applications. Those applications do not decompose into embarrassingly parallel tasks the way rendering frames or brute-force attacks on encryption do.
Now, if only SGI/Cray would develop a T3F based on the 21264 Alpha CPU.
We'll keep dreaming and hoping
well, so much for a "free market" (Score:1)
Re:So what the heck is a vector based computer any (Score:3)
Vector processing has one big advantage over massively parallel processing. Essentially, if you have a uniprocessor vector machine and any appreciable amount of your code is vectorizable, you reduce total CPU time as well as wall-clock time. With massively parallel systems, you always pay for reduced wall-clock time with increased CPU time due to synchronization overhead. A search on Amdahl's Law should turn up some interesting reading.
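For the curious, Amdahl's Law is a one-liner. This sketch uses made-up fractions, just to show how the serial remainder caps the overall speedup:

```python
# Amdahl's Law: if fraction f of the work is vectorizable/parallelizable and
# runs s times faster, overall speedup is 1 / ((1 - f) + f/s).

def amdahl_speedup(f, s):
    """Overall speedup for accelerable fraction f and acceleration factor s."""
    return 1.0 / ((1.0 - f) + f / s)

# 90% vectorizable code on a vector unit 10x faster than scalar:
print(round(amdahl_speedup(0.90, 10), 2))    # well short of 10x

# Even with effectively infinite parallelism, the serial 10% caps you at 10x:
print(round(amdahl_speedup(0.90, 1e9), 2))
```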
Looking back at some of my old documentation (~1990), I have these stats for a Cray Y-MP with a memory cycle time of 6 ns, and 2 FPUs/CPU:
1 CPU : peak throughput 333 MFLOPS
8 CPUs: peak throughput 2667 MFLOPS
These throughputs assume 100% vectorizable code. A single-CPU Y-MP running LINPACK with a vector size of 300 gets 187 MFLOPS. However, with a LINPACK vector size of 1000, the throughput is 308 MFLOPS, which approaches the peak throughput pretty closely.
I wish I had something that stated just how deep the vector registers are on a Y-MP. I suspect it's somewhere close to 100 (i.e., a 100-stage pipeline!).
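A toy pipeline-fill model shows why longer vectors get closer to peak: a pipeline with fixed startup latency delivers roughly r(n) = peak * n / (n0 + n) for vector length n, where n0 is the "half-performance length". The n0 = 100 here is my guess from the pipeline-depth hunch above, not a measured Y-MP figure, and real LINPACK numbers won't fit this model exactly:

```python
# Toy model of vector startup cost: r(n) = peak * n / (n_half + n).
# n_half (the half-performance vector length) is an assumed value, not a spec.

def vector_rate(peak_mflops, n_half, n):
    """Sustained MFLOPS for vector length n under a simple fill model."""
    return peak_mflops * n / (n_half + n)

peak = 333.0  # MFLOPS, the 1-CPU Y-MP peak quoted above
for n in (64, 300, 1000, 10000):
    print(n, round(vector_rate(peak, 100.0, n), 1))
```

Short vectors spend most of their time filling the pipe; long ones amortize the startup cost away.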
Anyway, this post is too long. Bye now.
Bill Wilson
The Keeper of Cheese [netdoor.com]
You want it, you pay for it ... (Score:5)
As for the US support of Cray, well, jaded veterans of comp.sys.supercomputer and HPCC practitioners are well aware of the historical situation with federal funding, technical advantages, and bang-for-buck comparisons with Fujitsu and NEC vector computers. For people interested in what the Japanese are doing, I believe NEC is planning on introducing a 1 Teraflop machine with the goal of hitting 5 Teraflops for their Whole Earth Simulator project [nasda.go.jp]. Some scientists' idea of heaven is a dedicated vector box, and for their purposes and types of code, it is a valid desire.
The SV2 is a curious beast, effectively the first stage in the merging of the Origin cc-NUMA memory subsystem and vector chips. You can think of it as a hybrid box allowing various combinations of graphics pipes, MIPS/Merced nodes, and vector nodes. The gripe of some people is that they are looking for a successor to the top-end T90 and they are impatient. However, developing at the high end is always trickier than people expect (witness Merced), as you need to increase capabilities along a multitude of dimensions (memory latency, I/O subsystem, heat dissipation, networking) rather than relying on the automatic boost from Moore's Law. Unfortunately there are very few applications which demand absolute performance regardless of actual cost.
To paraphrase crass consumerism: if you have to ask about the price, you can't afford it.
LL
Supercomputer >> Computer (Score:2)
Supercomputer != Beowulf! Network latency sucks in comparison, for tasks like this. Only readily parallelizable tasks that *don't* need lots and lots of RAM at once, and *don't* benefit greatly from vector operations, are better for Beowulf. In fact, some cryptography problems have been mostly solved on regular computers and then finished on a Cray, *because* the Cray did that stage of the problem so much better.
Of course, it won't be Cray without Seymour.
It must have been nice to build a supercomputing couch, though, back in the day.
Re:It seems like a good thing. (Score:1)
Re:NSA... (Score:1)
Some other things would profit much more from new supercomputers. What they are really good at is solving partial differential equations, either on really big domains (weather forecasting) or with high precision (nuclear weapons simulation). And I can understand why the US government might be interested in that.
Laurent
---
Of course =) (Score:2)
One could very well chalk up the price difference to reducing communications/messaging latency between computing units, cooling solutions, memory architectures, etc.
And hasn't it already been shown, at least for brute-force cracking techniques, that massively parallel computing does work? So a Beowulf would be okay for such a task?
Of course this is different from using a more elegant and efficient algorithm to handle encryption/decryption attacks, but that is beyond my/our ken as well.
-AS
Re:Supercomputer >> Computer (Score:1)
Re:So what the heck is a vector based computer any (Score:2)
It was discovered by some chip makers a couple of years ago (*grin*).
Beowulfs can't do everything. (Score:1)
For example, some data structures are irregularly organized (unlike the average matrix inversion) and are operated on by algorithms that alter their structure in mid-run. All you can do is parallelize some of the for(;;) loops and then resynchronize constantly.
For those you have no choice but big iron.
Re:Linux involvement is possible for the front-end (Score:1)
The confusion may center around three points.
I hope this clears up some of the confusion.
-Dean
By now we all know the G4 wins... (Score:2)
"Thank you, Sergeant... you may put away your tank."
Re:Is this the same Cray SGI just Dumped???? (Score:1)
For once in my measly life I am not being sarcastic; I think it's a question that bears asking. Here are some good things happening to a market that others (and, by the way, SGI) have said was "dead." (Do not flame me for that; SGI states in their Q3 1999 report that "The Company believes that the decline in the UNIX workstation and vector supercomputer markets are long-term trends...")
But that was under Belluzzo. To quote "Hopper": "Oh, I see, under new management."
Things have been quiet at SGI since they changed CEOs, at least from an outsider's perspective. Maybe they'll get noisy again in a good way.
It has to be said... (Score:1)
How much is that in Beowulfs? (Score:1)
So how many Linux'ed Pentiums networked with Beowulf [beowulf.org] would be needed to give it a run for the money?
PGP (Score:1)
Or will it take 20 billion years for even these to crack something?
Please enlighten me, I lack knowings.
Linux involvement is possible for the front-end (Score:3)
Thanks
Bruce
G4 Beowulfs (Score:3)
That's $1600 each, or $16,000,000 per ten teraflops.
It may be cheaper on PIIIs, but it would also take more PIIIs.
I'm assuming it's a headless network at $1600 each, btw.
-AS
NSA... (Score:1)
"In addition to critical government applications..."... this doesn't sound good.
echelon: FBI CIA NSA IRS ATF BATF Malcolm X
Militia Gun Handgun MILGOV Assault Rifle Terrorism Bomb
So what length keys do you have then? (Score:1)
Me? Oh, I use a 4096-bit DH key (the world can always be invaded by aliens that have a ray-gun that can turn entire galaxies into large computron accelerators that connect in a near-infinite number of dimensions |) ).
LINUX stands for: Linux Inux Nux Ux X
Click me, read me (Score:3)
Cheers,
ZicoKnows@hotmail.com
Re:So what the heck is a vector based computer any (Score:2)
Bruce
Well, I fell off my chair, but it makes sense... (Score:4)
Parallel machines, such as the Cray T3E, IBM SP2+, and to a lesser extent Beowulf clusters, just give so many more Gflops/$. But as has been pointed out, they are completely unsuited for some problems, for which you simply need all your power concentrated in a small number of CPUs.
My guess is that there is not enough market for a new US vector supercomputer, and the US government is stepping in so as not to become dependent on imported hardware. If most SV1 installations are government, that might explain why we've heard so little about them.
BTW, many older supercomputers were single-user machines, which required a front end running a multi-user operating system to schedule jobs. However, all recent machines, including the C90, T90, T3E, and I suspect the SV1 and SV2, run their own operating system (they are self-hosting). In this case it is UNICOS, a Cray Unix which is gradually being merged with SGI IRIX.
It seems like a good thing. (Score:1)
Re:Linux involvement is possible for the front-end (Score:2)
The original Cray 1 and 1/S were like this, being parasitic on an IBM or VAX host. Modern supercomputers (Cray, Fujitsu, NEC) run their own operating systems.
Similarly, the first Cray parallel machine, the T3D, parasited off the back of a Cray vector machine, a J90 or C90. The T3E, however, devotes some of its own Alphas to being I/O processors.
Vector architecture - it's all in the memory... (Score:3)
These computers are designed for one type of calculation - matrix algebra, which requires simple processing (multiply, add, invert) on enormous 2D arrays of numbers. The important point: the working set of these problems is often the size of the matrices, so caches are ineffective. Cray vector machines do not have ANY data cache between the vector units and main memory.
The feature of most vector machines that no one has really pointed out yet is the way the memory system keeps those vector units fed. Unlike a microprocessor, which relies on 2 or more layers of cache, the vector machine is quite capable of streaming data from memory fast enough to keep the processor 100% busy.
The vector instruction results in a number of vector fetches being issued to the memory controller, which is told to fetch a strided vector from memory (start at address x, every ith word, n words in total). The memory controller issues requests to individual banks in an overlapping fashion. Like the vector FPU, after it gets over its latency it starts banging the words out once per clock cycle.
The way this is done is to have a huge interleave in the memory - Cray T90s use either 1024 or 2048 banks. So long as any one bank is not hit more than once per 100 ns or so (the cycle time of the memory), the memory controller is capable of delivering multiple streams of 64-bit words at full clock speed (T90s can have up to 32 CPUs). Typically, to ensure this, arrays in programs are structured to stride along the first dimension (FORTRAN, remember), or where this isn't possible, the array dimensions are chosen to be prime numbers.
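The prime-dimension trick falls out of simple number theory: with B banks, a stride-k stream touches only B / gcd(B, k) distinct banks, so a stride sharing a big factor with the bank count hammers a few banks over and over. A quick sketch (bank count from the T90 example above; the strides are arbitrary illustrations):

```python
# With B interleaved banks, a stride-k access stream cycles through
# B / gcd(B, k) distinct banks. Bad strides cause bank conflicts.
from math import gcd

def banks_touched(n_banks, stride, n_accesses):
    """Count distinct banks hit by a stride-`stride` stream of accesses."""
    return len({(i * stride) % n_banks for i in range(n_accesses)})

B = 1024  # bank count, per the T90 figures above
for stride in (1, 512, 509):
    predicted = B // gcd(B, stride)  # number-theory prediction
    print(stride, banks_touched(B, stride, 4096), predicted)
```

Stride 1 and the prime stride 509 spread accesses over all 1024 banks; stride 512 ping-pongs between just 2 banks, which is exactly the conflict case that prime array dimensions avoid.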
The thing that sets these machines apart is not processor speed - the peak speed in MFlops on an NEC SX-5 is only about 2-3 times that of an Alpha AXP. The thing that makes them special is the memory bandwidth to sustain that performance.
To use my favourite automotive analogy, if a PC is a small hatchback, then a supercomputer is an 18-wheeler, not a Ferrari.
Cost of a Cray SV2? (Score:2)
If you use a G4 PowerMac and its highly advertised 1 gflop rating as a base, at $1,600 each, to reach a 10 tflop rating you would need 10,000 networked machines, so each 10 tflops would cost $16,000,000. I don't know how a comparable PIII or Celeron would perform, though, or at what price.
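The back-of-the-envelope math, spelled out (the per-node price and the 1-gflop rating are the assumptions quoted above, and this ignores networking gear and the efficiency losses discussed elsewhere in the thread):

```python
# Naive cluster cost estimate: nodes needed to hit a target aggregate rating.
target_tflops = 10.0
per_node_gflops = 1.0   # Apple's advertised G4 rating (assumed, peak not sustained)
price_per_node = 1600   # USD, assumed headless configuration

nodes = target_tflops * 1000 / per_node_gflops
print(int(nodes))                   # machines required
print(int(nodes * price_per_node))  # total cost in dollars
```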
16 mil is a lot to spend. But for government purposes it may just be a drop in the bucket.
-AS
386sx-16 Beowulfs (Score:1)
It's a national security issue... (Score:2)
First it was blocking the NCAR deal, and now this.
Offtopic: Cray as a personality (Score:1)
I don't think we'll see anyone quite like that again.
Re:actually... (Score:1)
Argh. Ignore and don't keep responding! (Score:1)
Sigh.
*breath*
*breath*
*breath*
-AS
Re:Of course =) (Score:1)
From the unclassified bits of information available on the NSA, they are very interested in doing sophisticated statistical analysis on encrypted data. I suspect that is the main task of the vector supercomputers.
The NSA is the biggest employer of mathematicians in the USA. They have probably developed techniques of attacking ciphers that are considerably better than brute force attacks.
Re:It seems like a good thing. (Score:1)