SGI Installs First Itanium Cluster At OSC
Troy Baer writes: "SGI and the Ohio Supercomputer Center (OSC) have announced the installation of the first cluster using Itanium processors. The system consists of 73 SGI 750 nodes, each with two Itanium 733MHz procs and 4GB of memory, connected by Myrinet 2000 and Ethernet. Software includes Linux/ia64, SGI's ia64 compiler suite, MPICH/ch_gm, OpenPBS, and Maui Scheduler."
Intel fighting the "megahurtz myth"? (Score:1)
Of course Intel has the big advantage that if person X doesn't go for a 733 MHz Itanium, or even y * 733 MHz Itaniums, they'll still go for a 1.8 GHz Intel P4. In other words, Intel doesn't have to sell the Itanium to consumers yet, whereas Apple has to sell the PowerPC. Intel, like Microsoft, can win by stagnation.
More information (Score:1)
Seti@home (Score:2, Funny)
Interesting (Score:1, Funny)
One has to wonder whether this would have been posted on Slashdot if the OS were Microsoft Windows 2000 Datacenter. :) Just a thought...
Re:Interesting (Score:3, Informative)
It wouldn't be worth mentioning, since you can't cluster more than four nodes with W2k Datacenter. When you compare that to this cluster of 70+...
How Much Power and A/C for this Itanium Cluster? (Score:1, Interesting)
Re:How Much Power and A/C for this Itanium Cluster (Score:1)
OSC's Itanium cluster is physically located at the State of Ohio Computing Center (SOCC), which provides all the power and cooling. The power for the cluster is nineteen 30A/220V circuits, which go to a set of Sentry power controllers. The total heat load is about 195k BTU/hr, but the SOCC building is (over-)engineered such that this much of a heat load did not require additional AC capacity.
Wrong logo, Wrong idea (Score:3, Interesting)
SiliconGraphics has left the building. The "hip new" SGI is here. Quit using the old logo, it reflects a much cooler company that no longer exists.
Re:Wrong logo, Wrong idea (Score:3, Informative)
Re:Wrong logo, Wrong idea (Score:1)
I think SGI has realized the folly of its full-scale leap into Intel and is finally starting to get back to their roots again.
Re:Wrong logo, Wrong idea (Score:4, Interesting)
http://oss.sgi.com/projects/linux-scalability/d
It's a link to a running *single image* (i.e. not a Beowulf cluster) 128-proc Linux system on a MIPS box. When was the last time you saw Dell, HP, or Gateway do that with a Linux system? This little cluster in the post is not where SGI is going with their systems (as you said, anybody can do that); they are moving toward Intel NUMA systems with high-speed NUMAlink interconnects that are much faster than standard Myrinet (their crossbar is in the gigaBYTES per second).
The really interesting part that I see (if they can live long enough for Intel to release Merced) is the system partitioning and its modularization. Need two more procs for your database but all your CPU slots are full? Just plug in another "C brick"; you won't have to worry about running out of CPU slots in your frame. Everything is a component, so you won't have to do another forklift upgrade. Also, with their partitioning I can purchase a 100-proc system and partition it into twenty 5-proc systems, all within the same frame. I don't have to pay the overhead of 20 different frames, or for expansion space in 20 of those frames, because all I need to do is plug in another "C brick" and give two of the boxes two more CPUs each, without ever having bought the headroom to begin with.
Whoa, getting a bit long. SGI really has some cool stuff going on right now. If they could only market themselves out of a paper bag, they wouldn't be in the bled-dry situation they are in. Personally I think the best thing for them would be to be bought by someone with big pockets who can market a product properly.
Re:Wrong logo, Wrong idea (Score:1)
Re:Wrong logo, Wrong idea (Score:2)
Anyway, yeah, you have to buy the I-brick, P-brick & C-brick to get a system running, and each system will need to have at least an I- and P-brick; but adding to an existing system is no problem.
Re:Wrong logo, Wrong idea (Score:2)
Re:Wrong logo, Wrong idea (Score:2)
5.86 Gflops per processor (Score:5, Informative)
Unfortunately, this seems to mark Intel's latest attempt to push an overpriced, substandard product at us. The P4 was crippled from the beginning and is only just now beginning to show any promise. The PIII at 1.13 and 1.2 GHz is finally available 8 months after the recall of their failed 1.13 GHz processor. Even their purchase of Alpha from Compaq seems to be just stock propping, because the original creators of the Alpha are now working for AMD. The reason Compaq was willing to sell in the first place is that the second-generation Alpha has been subjected to over three years of delays because they simply did not have the engineering talent to improve a ten-year-old design.
The talented engineers are working for AMD; they built the Athlon and are working on the Sledgehammer.
Before anyone jumps to Intel's defence, like they need defending as long as they are the 800 pound gorilla, keep this in mind:
Craig Barrett warned, "This was a year of record annual revenue and earnings; yet, slowing economic conditions impacted fourth quarter growth and are causing near-term uncertainty." He was faced with AMD going from 10% market share to 34% market share in a year. Wall Street took Barrett's word as gospel that the entire market was in decline and not just Intel's market share. Intel is a market bellwether, so we all got laid off, just so Intel would not have to admit that AMD had a better product. Nasty business. Intel does not have a great product, and they are reckless with their power.
Re:5.86 Gflops per processor (Score:3, Interesting)
Re:5.86 Gflops per processor (Score:1)
No, 11.8 really. (Score:1)
Actually, better G4 benchmarks [blacklablinux.com] are at Black Lab Linux.
Bullshit alert (Score:3, Insightful)
Pentium 4 has absolutely pathetic floating-point performance. Even a Pentium 3 at 1000MHz outperforms a Pentium 4 at 1500MHz on floating point. See here [tomshardware.com] for example. Your claim that Pentium 4 can do 3 floating point operations per clock cycle is nothing more than pulling numbers out of your ass (unless you can somehow substantiate your ridiculous claim).
The P4 loses to the Athlon simply because the compilers cannot use the vector instructions properly.
AMD has never had code optimized for their CPUs. They have always fought an uphill battle. Yet they managed to beat the crap out of Intel in absolute performance (price/performance they had for a long time). The whole compiler argument is a straw man. AMD has 3DNow! instructions which nobody uses. If current software was optimized for AMD, P4 would look even more pathetic.
Why anyone would buy an Itanium instead of a dual P4/Athlon beats me.
Uhhh, perhaps because there is no such thing as dual P4?
It has less on-chip cache than a Celeron (128KB total)!! Sure it's packaged with a lot of SRAM, but still.
I don't know how to break it to you, but 1) Celeron has exactly 128KB L2 cache, and 2) SRAM stands for Static RAM, which is used for cache (as opposed to Dynamic RAM, which is used for the main memory).
Re:Bullshit alert (Score:1)
x87 FPU, sure. But that was only left on for backward compatibility. Once SSE2 is turned on (which, unlike on the Athlon, is contained in a separate unit on the processor and does not run in the FPU), the Pentium 4's FPU becomes a force to be reckoned with (surely you saw the SPEC scores?). As the original poster noted, the problem is most current compilers can't generate proper SSE2 code, if any at all (it'd be like using a compiler optimized for the 386 on your Pentium).
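To give a feel for what "proper SSE2 code" means, here's a rough sketch of my own (illustrative only, not compiler output) using Intel's emmintrin.h intrinsics; the point is that each instruction works on two doubles at once instead of pushing them through the x87 stack:

#include <emmintrin.h>   /* SSE2 intrinsics */

/* Rough sketch, mine: add two arrays of doubles two at a time with SSE2.
   Assumes n is even and the arrays don't overlap. */
void add_pd(double *dst, const double *a, const double *b, int n)
{
    int i;
    for (i = 0; i < n; i += 2) {
        __m128d va = _mm_loadu_pd(&a[i]);              /* load two doubles  */
        __m128d vb = _mm_loadu_pd(&b[i]);
        _mm_storeu_pd(&dst[i], _mm_add_pd(va, vb));    /* packed double add */
    }
}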
If current software was optimized for AMD, P4 would look even more pathetic.
Completely off base. The Athlon 4 uses SSE, and rumor has it the upcoming chips will use SSE2. If people optimized for the new Athlons, the Pentium 3/4 would also stand to gain a great deal.
Uhhh, perhaps because there is no such thing as dual P4?
They're called Intel Xeons. While it's true that it's not technically a "Pentium 4" by name, it is a P4.
Re:Bullshit alert (Score:2)
x87 FPU, sure. But that was only left on for backward compatibility. Once SSE2 is turned on (which, unlike on the Athlon, is contained in a separate unit on the processor and does not run in the FPU), the Pentium 4's FPU becomes a force to be reckoned with (surely you saw the SPEC scores?)
That's like saying Windows supports win32 binaries only for backwards compatibility. P4's FPU really stinks, and now all of a sudden Intel wants everybody to switch to SSE. Well, sure, it will help (and, as you noted, it will also help the Athlon, which currently supports SSE and whose future versions will support SSE2 as well), but high-precision floating point arithmetic is impossible with SSE, so the x87 FPU is not going to go away. As for having the x87 FPU and SSE in separate ALUs, well, that won't really matter if, as you claim, everybody will up and switch to SSE, will it? So my claim stands: P4's FPU is pathetic. If software was optimized for P4 and Athlon, P4 would still be pathetic. If future software uses SSE2, Athlon's performance would improve just as much as that of P4. And furthermore: "the compiler is not able to take advantage of some whiz-bang feature" is a straw-man argument. I repeat: AMD has always fought an uphill battle and has never had software optimized specifically for their CPUs, yet they still managed to beat the crap out of Intel.
They're called Intel Xeons. While it's true that it's not technically a "Pentium 4" by name, it is a P4
A Xeon is based on a P3 core. Xeon is just an overpriced P3 with a larger L2 cache. My claim stands: there is no such thing as a dual P4. If and when Xeons with a P4 core are released, then we'll talk.
You want numbers, here're some numbers... (Score:1)
I was hoping to avoid this pissing match, but these claims are too ridiculous to let pass.
If software was optimized for P4 and Athlon, P4 would still be pathetic.
The NAS Parallel Benchmarks [osc.edu] would seem to indicate you are wrong.
A Xeon is based on a P3 core. Xeon is just an overpriced P3 with a larger L2 cache. My claim stands: there is no such thing as a dual P4. If and when Xeons with a P4 core are released, then we'll talk.
P4 Xeons have been available for several weeks now; we've had a dual P4 machine on site for about a month. Here's the /proc/cpuinfo from it:
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 0
model name : Intel(R) Xeon(TM) CPU 1700MHz
stepping : 10
cpu MHz : 1695.874
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss tm
bogomips : 3381.65
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 0
model name : Intel(R) Xeon(TM) CPU 1700MHz
stepping : 10
cpu MHz : 1695.874
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss tm
bogomips : 3381.65
You really ought to check your facts...
These numbers are meaningless (Score:2)
OK, I was not aware of that, but tell me where I can actually buy them. I looked at Dell's site before I posted. If Dell doesn't have them, nobody else does.
The NAS Parallel Benchmarks would seem to indicate you are wrong.
You pointed to the single (type of) benchmark that makes the P4 look good, though not because of the P4's virtues. The P4 has two channels of RDRAM, giving it 1600 * 2 = 3200 MB/s of memory bandwidth. The Athlon machine in the benchmark you pointed to had a single channel of DDR DRAM, giving it 2128 MB/s of memory bandwidth. Fluid dynamics simulation is one of the _few_ things that can really take advantage of greater memory bandwidth. Therefore the P4 wins this round by riding on Rambus's coat-tails. Prove me wrong. Point me to the same benchmark where both the P4 and the Athlon have the same type of memory (both support SDRAM now, you know).
Re:Why should I _care_ about having the same memor (Score:2)
That's just plain not true. The P4 memory bus is 32 bits wide (16 bits for each RDRAM channel). The Athlon memory bus is 64 bits wide (one DDR DRAM channel). You are probably thinking of the width of the L2 cache datapath, which is 256 bits for the P3 and P4 (I don't recall what it is for the Athlon).
Or that I can't get an optimizing compiler for the Athlon that's comparable to the Intel Fortran compiler?
I think I said it 3 times already, but I'll repeat it again for posterity. AMD has never had software optimized for their CPUs. They have always fought an uphill battle. Yet they still managed to beat the shit out of Intel.
Look, I don't have the luxury of caring which processor is better in a "fair" test with the same memory, etc. -- my job is to figure out which system (processor/memory/IO/etc.) is fastest for our users' applications
That's fine and that's a reasonable thing to do. But you cannot use your benchmarks to claim that P4 is faster than Athlon. Your benchmarks show that P4 with RDRAM is faster than Athlon with DDR DRAM. This does not imply that P4 is faster than Athlon, which is what you were trying to claim. Also, you cannot make such a claim based on only one benchmark.
Once again, I have no problem with people trying to find the best system for their needs. I have a problem with unsubstantiated claims.
BTW, I suspect that for your particular application, memory bandwidth matters more than the CPU speed. So you may actually be benchmarking memory instead of CPU.
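To illustrate what I mean by benchmarking memory: a kernel like the STREAM-style triad below (my sketch, not one of the NAS codes) does one multiply-add per 24 bytes of memory traffic, so the score is set almost entirely by the memory subsystem, not by the CPU:

#include <stddef.h>

/* Sketch (mine, for illustration): a STREAM-style triad.  Very little
   arithmetic per byte moved, so throughput tracks memory bandwidth. */
void triad(double *a, const double *b, const double *c, double s, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        a[i] = b[i] + s * c[i];
}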
Re:Why should I _care_ about having the same memor (Score:1)
But you cannot use your benchmarks to claim that P4 is faster than Athlon. Your benchmarks show that P4 with RDRAM is faster than Athlon with DDR DRAM. This does not imply that P4 is faster than Athlon, which is what you were trying to claim.
Actually, I was providing a counter-example to a specific comment of yours:
I can get Athlon-specific optimizations if I use the Portland Group compilers (-tp athlon -Mvect=prefetch), so claiming that "AMD has never had software optimized for their CPUs" is not entirely true. Certainly they're not as well developed as the P4 optimizations in Intel's compilers, but they do exist.
Also, you cannot make such a claim based on only one benchmark.
I have a hard time viewing the seven different codes in the NAS Parallel Benchmarks as "one benchmark"; the algorithms in the CG (conjugate gradient) benchmark for instance are very different from those in FT (Fourier transform) or IS (integer sort), and they stress different parts of the architecture. They do all favor high memory bandwidth systems, but that's the nature of the beast in HPC -- almost all scientific applications are memory bandwidth hungry.
Re:Why should I _care_ about having the same memor (Score:2)
They do all favor high memory bandwidth systems, but that's the nature of the beast in HPC -- almost all scientific applications are memory bandwidth hungry.
Who cares if the benchmarks use different algorithms? They all test exactly the same thing: memory bandwidth. This statement of yours actually makes my point above stronger. You admit that all of these benchmarks are highly dependent on memory bandwidth. It appears that you agree with my supposition that these benchmarks depend more on the memory bandwidth than the CPU speed (integer sort for example would be 100% memory I/O). Yet you are still trying to claim that P4 is faster than Athlon. These benchmarks do not prove that. They prove that the bandwidth of two channels of RDRAM is greater than that of one channel of DDR DRAM, but we kind of knew that already, didn't we?
So, your counterexample is invalid and my claim stands. I really think that we are starting to go around in circles.
Re:Why should I _care_ about having the same memor (Score:2)
Re:Bullshit alert (Score:1)
So does AMD, apparently. Maybe that should tell you something? Like people are sick of the stack-based x87 FPU?
If software was optimized for P4 and Athlon, P4 would still be pathetic.
Go to Tom's Hardware and read up on the FlasK fiasco, your claims are way off base.
A Xeon is based on a P3 core. Xeon is just an overpriced P3 with a larger L2 cache. My claim stands: there is no such thing as a dual P4. If and when Xeons with a P4 core are released, then we'll talk. ;)
A Pentium III Xeon is based on the P3 core. A 'Xeon' (full name) is based on the Pentium 4. I think this little argument right here invalidates your whole damn argument.
the difference between the elite and the l33t (Score:1, Funny)
Myrinet interfaced via PCI? Argh! SN-IA won't be here soon enough!
l33t d00d with an overclocked athlon:
SCHWEEEET, a BEOWULF CLUSTER! With an IDE RAID on each node I could have years of DiVX movies on that!!!
woohoo (Score:1)
bah (Score:2)
SGI has been faithful at least (Score:4, Interesting)
I'm glad there are still big players in the Linux field, though, it helps forward the cause and the OS and lets people know there IS an alternative. By all means, SUN and other, keep your propriatary stuff available and have that as the default, but allow people the option to choose another OS if they so desire.
DanH
Slashdot (Score:2, Insightful)
http://dailynews.yahoo.com/h/nm/20010809/tc/tec
Has been out for almost *five* hours. The story talks about how Linux is going to be used as the OS for the *biggest* *cluster* *of* *super* *computers* *in* *history*.
It is the greatest news I have heard in months, and it "matters" if you ask me. The supercomputer(s) will be funded by the National Science Foundation (NSF), and it is reported that the supercomputer(s) will be able to calculate in one second what it would take a hand calculator ten million years to calculate. In addition, the total disk space will be enough to hold almost one and a half million full-length novels.
In other words, the Linux OS is going to be used for the largest computing grid in the history of the world.
This story has been available on Yahoo!(TM), LinuxToday, Newsforge etc. for hours. I submitted the story 3 hours ago and nothing...
I used to read Slashdot for the news and told myself I could ignore the mindless trolling and moronic comments; now I realize the news service is garbage and I have no reason to read.
Looks like it is Newsforge or LinuxToday for me
:-)
Re:Slashdot (Score:3, Interesting)
The new systems sound great, but they're tiny compared to what it's going to be like when the GRID is up and running.
What the story fails to mention is that this system is likely to be connected to the other GRID environments in the States and the new ones in Europe, at which stage you won't be talking about just 4 supercomputer centres but nearly a hundred, each with several Tflops of processing power and a few petabytes (10**15 bytes) of storage.
I would suggest that to put this in the proper perspective you also look at IBM's contract to do the same for 4 sites in the Netherlands, the UK GRID which has 9 sites, the German one (which I don't know much about but is fairly advanced) and the CERN DataGrid. These are all interconnected, with the same people working on several at a time.
Or you could have a look at the top500, find all the supercomputers in Europe & the US which aren't classified or commercial and then figure out what their combined processing power is. You should then have a fair idea just how much processing power there will be in a couple of years time 8)
Now back to the Particle Physics experiments
geez.. sounds like you're describing Deep Thought (Score:2, Funny)
We can ask it for the answer to the great question of the universe?
Re:Slashdot (Score:2, Insightful)
Reading
IA64s are kickass... (Score:5, Interesting)
The machines also had 4GB of ram, so it was fun to do:
char * myStr = (char *)malloc(-1);
and have it succeed! (that's a 4GB memory allocation)
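Roughly what the test looked like (reconstructed from memory, so treat the details as illustrative); the fun part is that the allocation can succeed lazily, and it's touching the pages that makes the kernel actually find 4GB to back them:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative sketch, not the exact code we ran: a ~4GB allocation on a
   64-bit box, followed by a memset to force the pages to be committed. */
int main(void)
{
    size_t sz = 0xffffffffUL;   /* the ~4GB figure described above */
    char *p = malloc(sz);
    if (p == NULL) {
        perror("malloc");
        return 1;
    }
    memset(p, 0, sz);           /* touch every page */
    printf("allocated and touched %lu bytes\n", (unsigned long)sz);
    free(p);
    return 0;
}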
two words: (Score:2)
IA64 has pretty clean and easy asm, but debugging is a complete nightmare. It's certainly better than x86, but give me MIPS64 or MIPS R1X000 any day.
Re:two words: (Score:2)
That's kind of a pain (and at least the IA64 can be forced to do precise exceptions). Debugging something that the compiler has software-pipelined will be a giant pain. Your source code will look like a simple loop, but the machine state will show you several iterations in flight at once (i=7, 8, 9, 10 all at the same time... but not quite as far into the loop for some values).
Re:IA64s are kickass... (Score:1)
Re:IA64s are kickass... (Score:1)
codeing for ia64 in asm? (Score:1)
Re:IA64s are kickass... (Score:2, Interesting)
Re:IA64s are kickass... (Score:2, Interesting)
This was one example of low level optimizations, another one is giving hints to different branches (both target and outcome of branch conditions). This is also best done by the compiler (at least the branch target hints), and works even better if you can supply the compiler with profiling information. You can also give data prefetch hints and specify which cache level different prefetch data should go into.
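As a rough illustration of the prefetch-hint idea (my own sketch, using GCC's __builtin_prefetch built-in as a stand-in for the real IA-64 lfetch hints; whether your compiler supports it is another question):

/* Sketch (mine): prefetching a few iterations ahead of use.  The third
   argument is the locality hint: 0 = use once, 3 = keep in all cache levels. */
double dot(const double *a, const double *b, int n)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        __builtin_prefetch(&a[i + 16], 0, 1);   /* read-only, low locality */
        __builtin_prefetch(&b[i + 16], 0, 1);
        sum += a[i] * b[i];
    }
    return sum;
}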
Another example of when you might need to do asm is when you do SMP. The reason is that different load and store instructions are given semantics for how they are to behave in a multiprocessor environment: you want acquire semantics on this load, release semantics on this store, fence semantics here, undefined semantics there, etc. I can't see how the compiler would be able to generate correct assembly in this case (unless it is modified so that you can attach some new attributes to your variables and types).
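To make the acquire/release point concrete, here is roughly what it looks like if you wrap the ordered IA-64 load and store yourself in GCC inline asm (my sketch, modeled on the sort of thing the kernel does; treat it as illustration, not gospel):

/* Sketch (mine): IA-64 ordered memory operations, which plain C assignments
   cannot express.  ld4.acq = load with acquire semantics, st4.rel = store
   with release semantics. */
static inline int load_acquire(volatile int *p)
{
    int v;
    __asm__ __volatile__("ld4.acq %0 = [%1]" : "=r"(v) : "r"(p) : "memory");
    return v;
}

static inline void store_release(volatile int *p, int v)
{
    __asm__ __volatile__("st4.rel [%0] = %1" : : "r"(p), "r"(v) : "memory");
}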
Then there is this whole plethora of floating point stuff that I won't mention because I don't know shit about it.
Hmm, reading your post again I see that I didn't really answer your question, and most of my ranting about doing asm coding ended up with the conclusion that having the compiler do all the nasty stuff is probably better anyway. I guess I'd better shut up now.
Re:IA64s are kickass... (Score:1)
1) The compilers are MUCH more complex and therefore have complex bugs when compiling complex loops
2) C/C++ are not good enough languages to express enough hints to the compiler to allow it to optimize properly.
3) Therefore, C/C++ require non-portable language extensions for these hints.
4) Advanced usage of C++ inline templates CAN help, but are still not as flexible as you want in terms of allowing the best re-ordering of code possible.
For example, you HAVE to use the new 'restrict' keyword to give the compiler hints about un-aliased pointers. Pragmas or similar hacks are required to give the compiler hints about how many times a loop will be executed - for example, telling the compiler that although a loop is repeated based on a separate run-time variable or parameter, it is guaranteed never to be 0 and will always be a multiple of 8 iterations. That kind of thing allows the compiler to really do a good job on the loops - but is not part of the C/C++ standards.
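To make the 'restrict' point concrete (a quick sketch of my own, not from any real codebase): promising the compiler that the pointers never alias is what frees it to software-pipeline or vectorize the loop at all:

/* Quick sketch (mine): C99 'restrict' tells the compiler x and y never
   alias, so it may overlap and reorder the loop iterations freely. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    int i;
    for (i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}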
In my opinion, C and C++ are too 'low level' and restrictive in terms of allowing the compiler to re-order the code to be really effective on non-typical architectures like these. Unfortunately there don't seem to be any other options. I would expect that a functional language like OCaml would allow the compiler much more freedom to do a better job - once there is real support in OCaml for the ia64....
Also, I highly doubt that GCC is going to be the best or even a very good compiler for ia64 for some time. I'll be happy if i'm wrong though!
My opinion is that you did the right thing by diving into the assembly.
--jeff
Re:IA64s are kickass... (Score:1)
C is very very limited... For example, the single most important thing when coding math is using the carry flag, and there is NO WAY to use the carry flag in C... For example, to add 2 32-bit values together in C:
int add32(uint32_t a, uint32_t b, uint32_t *res)   /* uint32_t from <stdint.h> */
{
    *res = a + b;          /* wraps modulo 2^32 on overflow */
    return (*res < a);     /* carry out: the wrapped result is smaller than a */
}
whereas in assembly, you can just use the flag... This results in **at least** twice the speed of your application, just for adds... C is way too unspecific to be used effectively for math routines. That's why I had to resort to assembler. Even on the kickass IA64, with wacky crazy assembler that's very very weird to use, where we are told by Intel that "you should never need to use assembler, we have spent millions of dollars optimizing our compilers for it!", they specifically instructed us NOT to use assembler in our ports of our crypto code, citing that it'd be faster to use C... Well, I did that, and then later on ported it to assembler myself, by hand, and showed them that the asm code was 3x faster than the C code... Why? Just because of the things that I could access in asm that I couldn't access in C. The people at Intel were not too happy about this, and couldn't understand why it'd be faster. (sigh) Oh well... (end rant)
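For the curious, here is roughly what "just using the flag" looks like from C with inline asm (x86 shown, since that's where the carry flag lives; sketched from memory, so treat it as illustration rather than our shipping code):

#include <stdint.h>

/* Illustrative sketch (mine): a 64-bit add built from two 32-bit limbs.
   The carry flows through the CPU carry flag via adc, which plain C
   cannot express without the extra compare shown above. */
static void add64(uint32_t a_hi, uint32_t a_lo, uint32_t b_hi, uint32_t b_lo,
                  uint32_t *r_hi, uint32_t *r_lo)
{
    uint32_t lo = a_lo, hi = a_hi;
    __asm__("addl %2, %0\n\t"   /* lo += b_lo, sets the carry flag */
            "adcl %3, %1"       /* hi += b_hi + carry              */
            : "+r"(lo), "+r"(hi)
            : "g"(b_lo), "g"(b_hi)
            : "cc");
    *r_lo = lo;
    *r_hi = hi;
}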
Re:IA64s are kickass... (Score:1)
For example, in C and C++ I end up rewriting the standard biquad filter not just for each DSP architecture but for each way that the data could be organized. And the math is the same (and pretty simple) each way. I feel that this is exactly the case where the compiler could do the re-organization for me. But the compiler needs to know a higher-level concept of what I am trying to do - so it could not only run multiple iterations of a loop in parallel, it should also be able to interleave the instructions that comprise two or more function calls with each other (where they are allowed to be done in parallel) in order to fill up the pipeline.
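For reference, the kernel itself is tiny; here is a quick Direct Form I sketch of the biquad I mean (illustrative, not any of my production versions):

/* Quick sketch (mine): one channel of a Direct Form I biquad.
   y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
   The data layout (interleaved channels, block size, etc.) is what forces
   the per-architecture rewrites I'm complaining about. */
typedef struct {
    float b0, b1, b2, a1, a2;   /* coefficients           */
    float x1, x2, y1, y2;       /* delayed inputs/outputs */
} biquad;

static float biquad_step(biquad *f, float x)
{
    float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
            - f->a1 * f->y1 - f->a2 * f->y2;
    f->x2 = f->x1;  f->x1 = x;
    f->y2 = f->y1;  f->y1 = y;
    return y;
}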
--jeff
Literally (Score:1)
Re:IA64s are kickass... (Score:2)
You're going back to school, m'laddo.
malloc(3) has overhead. You should have expected that to fail if you were thinking only of how much RAM you had. Fortunately, modren cowmpooterz has virtual memory. So you could have got away with that when you got your first 5GB hard drive.
--Blair
Re:IA64s are kickass... (Score:2)
Re:IA64s are kickass... (Score:2)
malloc(3) has a job to do, and that's to keep you from using memory you can't use. If it doesn't "allocate" it until you access it, then whatever system you're using is incontrovertibly broken.
malloc(3) is at least two layers above the actual memory. Plenty of room for the MMS and MMU to translate it.
Don't go assuming what I was running.
And if Linux limits memory to 3GB (which I did not know, as I don't often dig around the internals of toy computers) then what good is it?
--Blair
Re:IA64s are kickass... (Score:2)
A 32-bit CPU can address 2^32 bytes of virtual memory just fine.
--Blair
0xffffffffffffffff
Humm (Score:1, Funny)
in related news (Score:3, Informative)
Re:in related news (Score:2)
"creating the 13.6-teraflops system--the most powerful distributed computing system ever"
OK, what exactly does that make seti@home? 16.98 TFLOPS as I write this, distributed computing system, scientific research... What am I missing?
Re:in related news (Score:1)
computing power (Score:1)
a team of about 4-5 guys trying to keep the network up and running because of stupid users.
let's just hope this megacluster will be put to good use and not be the target of some bored hacker who installs a counter-strike server or whatever.
*believes in the good of humanity*
a dialogue (Score:4, Funny)
this company what I read about not half an
hour ago on this very website.
Me: Oh yes, the, uh, the Workstation manufacturer...What's,uh...What's
wrong with it?
S: I'll tell you what's wrong with it, my lad. it's dead,
that's what's wrong with it!
M: No, no, it's uh,...it's resting.
S: Look, matey, I know a dead company when I see one, and
I'm looking at one right now.
M: No no it's not dead, it's, it's restin'! Remarkable company,
the SGI, idn'it, ay? Powerful CPUs!
S: The CPUs don't enter into it. It's stone dead.
M: Nononono, no, no! It's resting!
S: All right then, if he's restin', I'll wake him up!
(shouting)
'Ello, Mister Bob Bishop! I've got a lovely fresh government
contract for you if you show...
M: There, it moved!
S: No, it didn't, that was you faking a press release!
M: I never!!
S: Yes, you did!
M: I never, never did anything...
S: (yelling and hitting the cage repeatedly) 'ELLO SGI!!!!!
Testing! Testing! Testing! Testing! This is your nine
o'clock alarm call!
See, guys, I told you they still had life left!
-Chris
It is worth noting... (Score:3, Interesting)
I knew Troy from school, admin-ed with him in the Ohio State engineering labs. Ask him what he's doing with that Aero Eng. diploma nowadays..
-'fester
I'll swap you (Score:2)
You can come admin these clusters and I'll go work with your Alphas and Origin 2000.
Not exactly... (Re:It is worth noting...) (Score:2, Insightful)
The person who submitted the story (Troy Baer) is also the admin of the beast.
To give credit where credit is due, the admin of that system is Doug Johnson, who has done an enormous amount of work to get this thing working. I'm just a user support guy who writes lots of documentation and happens to dabble in systems stuff like Maui and PVFS in my Copious Spare Time[tm].
Not a bad deal (Score:4, Funny)
"Damn. I asked for an iMac, but got this stupid Linux cluster instead!"
;^)
Cheers,
Jim in Tokyo
Compiler? (Score:1)
Thanks.
Re:Compiler? (Score:1)
Preferably use multi-threaded/multiprocessor sources.
Run them both and see how the SGI-compiled one is much faster and more efficient...
Mod Timothy redundant! It's been done folks.. (Score:2)
Also read The story at NCSA [uiuc.edu] if that's not enough for you.
SGI Sucks ( read on ) (Score:5, Insightful)
SGI sucks.
Most of their hardware is great, as is most of their software. But their head is completely up their ass these days.
Stagnant desktop machines. Impressive but overpriced big iron. OEM PCs. And a terrible logo. What went wrong? Where to begin??
Once upon a time there was a company called Silicon Graphics. They got their start by making wickedly powerful terminals to provide a 2D and 3D graphical front end to massive minicomputers and supercomputers. Mind you, this was two years before Apple introduced the Macintosh, and Xerox was still playing with the underpowered Star. Shortly thereafter they began selling a line of large rackmount, standalone graphical computers that used multiple large boards covered with CPUs, fast RAM, and other goodies to churn out decent primitive 3D in real time using the GL framework (later called IRIS GL, which eventually became OpenGL). This was about the time your dad upgraded from a C64 to an IBM XT.
Fast forward to 1995. You and I were probably playing with a Pentium 100 and looking forward to the rumored 3Dfx Voodoo card. In that same year, SGI upgraded their Onyx graphical supercomputers to InfiniteReality graphics... providing performance on par with a GeForce 256, except the IR could handle 64 MB of dedicated texture RAM and 320 MB of frame buffer. Three IR "pipes" could be installed in a single system, and each pipe could even be broken down into multiple channels. IR allowed the world of graphical simulation to finally approach photorealistic quality, with multiple projectors / monitors providing a wrap-around display (keep in mind that much of this was available on a limited scale in 1991 with SGI's RealityEngine pipes). Both the Onyx and SGI's non-graphical server, the Challenge, received a CPU upgrade: up to 24 MIPS R10000 CPUs running at 195 MHz (each providing 390 MFLOPS + 390 MIPS) could be installed in the Onyx. The Challenge could take up to 36. SGI's flagship desktop machine, the Indigo2, received upgrades as well. The top-of-the-line model had an R10K/195 CPU, up to 640 MB of interleaved RAM, two channels of SCSI, and Maximum Impact graphics (4 MB of dedicated texture RAM, 27 MB of framebuffer, and performance somewhere around that of the TNT2).
SGI's machines continued to get better. Indigo2 was replaced with the Octane. Onyx and Challenge were replaced with the Onyx2 and Origin, and later with the Onyx 3000 and Origin 3000.
Here we are in the middle of 2001. SiliconGraphics has become "sgi", with a NYSE stock price below $1. Their O2 desktop machine hasn't changed much since 1996, and aside from the new gfx card and faster CPUs, the Octane2 isn't a whole lot different than the original Octane from 1997. The Onyx 3000 uses updated graphics based on the original IR from 1995. Perhaps the only noteworthy change has been the architecture of the new Onyx and Origin. Both can scale as a single machine to 512 CPUs with 1 terabyte of RAM. Many of these massive machines can be clustered together for even more power... at an insane cost.
The company that brought us 3D on the desktop has pretty much come to a halt. Their desktop machines haven't changed much in almost 5 years. Their big iron is impressive, but expensive as all hell. And their PCs... where to begin on the PCs... They tried making what could have been the coolest pair of PCs of all time, but due to delays and driver issues, the machines ended up being overpriced, non-upgradable ho-hum boxes. Pretty soon they hit the other end of the spectrum with generic OEM PCs. And now this, the "SGI 750" Itanium: a box that is identical to those being sold by HP and Dell. The only thing SGI about it is the logo. We're not even dealing with the same SGI. This new "sgi" couldn't possibly have come from the same roots as the old, grand, SiliconGraphics.
I can't help but wonder what the old SiliconGraphics would be doing today. Like another poster pointed out, the Octane would probably have an even faster architecture, better graphics, and probably 4x the CPU power. This new Linux cluster would probably be based on much better machines and using something better than Myrinet (which is limited by the 66MHz/64-bit PCI bus the card sits in). The old SGI would have made a complete fire-breather, not some OEM stack that anyone could build themselves. The old SGI would have the cube logo *and* rightfully wear it.
When I look inside my old, used Indigo2 from 1995 what do I see? I see its 750 watt power supply. I see not a graphics card, but *three* massive cards working together and connected to the power supply via a thick jumper cable. I see engineering at its best. I see a product that pushed the limits of silicon and interconnects. I see something that was worth its $50,000 pricetag. I see something that was indeed an order of magnitude more powerful than anything else on the desktop.
When I look at the current SGI desktop machines, I see something I can buy for less at Best Buy.
I recently saw a demonstration of the Onyx 3000. One of the demos was a visualization app used by an automobile maker. The app showed a few different cars in full detail across three screens (each 1280x1024) in a panoramic configuration at a sustained, locked 75 Hz + 75 FPS. The cars had complete reflection features that interacted right down to the metallic flecks in the paint. The detail went right down to the 3D textures that made up the subtle surface of the dash plastics and the seat leather. It was truly photorealistic. I've seen the GeForce 3 demos; they were nowhere near as impressive as the car demo.
Another demonstration showed the Onyx's power at loading textures. The machine they had was connected to several RAIDs containing over 500 GB of satellite and aerial photos. On the same three screens, and at the same 75 Hz + 75 FPS, they were able to zoom down to a national park, pan across to another state, and zoom back out to planet Earth floating in space. All in real time. The RAIDs were clattering so loud I could hardly hear the man giving the demonstration. The Onyx never missed a beat.
If the old SGI was here today, we'd have that kind of power on the desktop. And it would cost $50,000 and consume 750 watts. Not $500,000 and 9,000 watts.
And we wouldn't have a Myrinet-connected stack of Itanium PCs. We'd have something a whole helluva lot better.
[end rant]
Re:SGI Sucks ( read on ) (Score:2, Interesting)
Sorry, but there's nothing overpriced about the Origin 3000 family. I saw a quote for a 16-processor O3400 with 16 GB of RAM; the bottom line at list price was around $500,000.
That seems totally competitive to me.
And since I'm posting anonymously anyway, you might be interested to know that SGI is planning to release a new product any minute now. (It was announced to the developer and integrator channels this week). It's a four-processor (MIPS, natch) server in a 2 RU package. They're calling it the Origin 300.
But the cool part is going to be an interconnect product codenamed Sprouter. It lets you take 2 to 8 Origin 300 systems with 4 procs each and connect them using NumaLink (formerly CrayLink) into a single system image of up to 32 processors.
At, it's projected, half the price of a 32-processor Origin 3000 system. And for my kind of programming anyway, single-system-image beats the pants off that Myrinet stuff.
The O300 has two 66MHz/64-bit PCI slots, so that's enough expansion to let you attach your basic I/O devices like Fibre Channel RAIDs and high-speed networking and stuff. Each server comes with USCSI3 built in, if anybody still uses that stuff. ;-)
Not everybody needs a medium-scale single-system-image IRIX machine, but I personally do a lot of ImageVision library programming. And ImageVision, being multithreaded at the core, loves big CPU counts. So for me, and people with needs like mine, it's going to be a very cool fall.
Re:SGI Sucks ( read on ) (Score:1)
Re:SGI Sucks ( read on ) (Score:1)
Wither XIO? (Score:2)
Re:SGI Sucks ( read on ) (Score:1)
Now that SGI is about to be lost, would *someone* please contact Dr. James H. Clark?!? I would hate to see this company go down just like Symbolics and Thinking Machines did before it.
Re:SGI Sucks ( read on ) (Score:2)
I placed my order straight away, and it still has not arrived. The reason? During this crisis period for the company, where everything is falling apart, they are apparently changing their sales systems to use the latest version of Oracle. During this transfer, their ordering system is entirely manual. They haven't lost my order (yet), but they did take many more orders than they had available machines for, and they claim to still be straightening out the mess.
I lost even more faith in SGI when I noticed their "configure your SGI system" web page was Windows-based. For shame, SGI!
Part of this is, of course, the kind of weird degradation that occurs at any big company. But I can't say I have any faith in a company that makes that kind of blunder. Changing your ordering systems to the extent that operations are severely impacted is just plain stupid; I've never managed a major business, but if I did, the minute I had problems of that severity, I'd go straight back to the old system and send the developers of the new one back to the drawing board. Elementary, surely?
Oh, and even though I am surely a beneficiary of a great monitor deal because of this, they should have dumped all the monitors on eBay, ten or so at a time. They would have made maybe $800 per monitor instead of $600. That's a pretty respectable chunk of change right there.
D
Re:SGI Sucks ( read on ) (Score:5, Insightful)
I know this is a rhetorical question, but having once spent a lot of time thinking about how to advise them before and during their fall, I'll give you my analysis. Some of this I saw at the time, some aspects I only saw too late. Learn from their mistakes.
Here's what went wrong:
P.S. I didn't even get into their server strategy, Cray, and later events. Another time perhaps.
Re:SGI Sucks ( read on ) (Score:2, Insightful)
I actually do not have any hope left for SGI. Here is why:
-Their workstations do not cover enough markets to sustain themselves, let alone generate enough revenue to put them back ahead of the pack.
-Their single-image clusters are not cost-effective enough for the scientific community, and their use for visualisation, where people would be willing to pay the insane price, isn't a large enough market.
-Anybody can build UNIX servers.
-Anybody can build MPI clusters.
So I think they are reduced to niche markets which will not cover their cost in R&D, which means they will have to license technology or buy components from others (CPUs, bus systems, etc.). However, they might run out of breath in the time it takes to redesign their gear to take advantage of the new tech and reduce the production cost.
So my guess is they will be consumed slowly by their competitors - let's only hope some of their tech remains.
Where it all went to hell (Score:1, Interesting)
The strategy was to go NT 4.0 on commodity (Intel) hardware. So SGI announced that it was reducing IRIX development and halting development of the MIPS processors (which were an order of magnitude faster than the Pentium of the day, and 64-bit to boot).
Very quickly, he was looking for a new job; SGI had Pentium machines running NT that no one wanted (and Linux for the better-informed customer), had restarted MIPS development, and continued with IRIX. However, by that time they had lost their lead.
I hope he lost all his money in the dot-com mania.
it's worse than that... (Score:4, Informative)
Where is Mr. Belluzzo today?
Hold on to your hat...
http://www.microsoft.com/presspass/exec/belluzzo/
fascinating (Score:2, Funny)
Belluzzo was formerly chief executive officer of Silicon Graphics Inc. (SGI), where he was responsible for defining and executing a return to growth and profitability for the company
Re:fascinating (Score:2)
Re:Where it all went to hell (Score:1)
Re:SGI Sucks ( read on ) (Score:2, Interesting)
However, I think the reason that SGI is not producing a desktop version of Onyx 3000 is obvious - SGI tried to do battle with Sun, etc. and failed. SGI tried to do battle with Dell, etc. and failed. They're not about to do the same thing to NVidia...
The original SGI targeted what was at the time a niche market - 3D graphics. It looks like the new SGI will also retreat into niche markets - very high-end graphics and compute servers.
You're also right about the Itanium server - there's nothing very interesting about it. I believe this machine is only intended as an interim solution to allow developers onto the platform until SN-IA is available (whenever that might be).
Then we'll see something a little more impressive than OEM Itanium boxes with low-bandwidth Myrinet interconnects!
15 years for MicroSoft to 64 bits? (Score:4, Insightful)
after those chips came out. Hope they are faster this time. 32-bit NT on an Itanium would be a waste. SGI and Sun have had full 64-bit OSes for 7 & 5 years. Yes, there are bugs to shake out in the beginning. Of course Bill & Steve will announce they are "just about to ship" for years until they do.
Re:15 years for MicroSoft to 64 bits? (Score:1)
Re:15 years for MicroSoft to 64 bits? (Score:2)
But hasn't there been a 64-bit version of Linux for a VERY long time - i.e., for the Alpha?
What I find interesting (and it is just a personal conspiracy theory of mine, and probably holds zero water) is that it took "this much" time for Intel to release their 64 bit chip - how long have we been hearing about it now? At least 2 years...
Is it just a coincidence that Microsoft finally has a running platform for it just at the time this chip comes to market?
???
15 years is pretty good (Score:2)
Want innovation? Who do you think has it? Also, one company uses a UNIX based OS; the other at least tried to fix the mistakes of the past, although they did make more in the process of doing so.
Competitors Know the Itanium Sucks (Score:2, Funny)
Also of note, IBM has stated that current plans for the G5's SIMD/AltiVec engine specify a 256-bit system, rather than a 128-bit one in the G4. This will be one kickass CPU.
Itanium? (Score:3, Funny)
Stan: "Why did you call it 'Itanium'?"
Dr. Adams: "I have a rare marketing disease that prevents me from pronouncing the first 'T' in 'itanium.'"
Re:Itanium? (Score:1)
Hrm.... Malcolm X or Mac OS X?? Kinda messes with your head doesn't it?
IA64 & Myrinet (Score:2, Informative)
An interesting fact is that up until a few days ago, Myrinet only supported 1 GIG systems. I ran into this while setting up the University of Nevada [unr.edu] beowulf named cortex.
I must admit, IA64 with Myrinet 2000 is gonna kick some serious computational ass.
The article says that Myrinet will run MPI, but it will also run PVM and TCP/IP stuff too.
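For anyone who hasn't touched MPI, the code side is completely ordinary C; a minimal sketch of my own (not from the article or from OSC) that would be built with MPICH's mpicc and launched across the nodes with mpirun:

#include <stdio.h>
#include <mpi.h>

/* Minimal sketch (mine): every process reports its rank; the Myrinet/GM
   transport underneath is invisible at this level. */
int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total processes in the job */
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}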
Check it out at their site [myricom.com]
OT: Music Box for IRIX? (Score:2)
Hey everybody,
I've had exactly two epiphanies in my life. One was the first time I saw an Amiga, back in 1985. I knew I would have one.
The other time, it was 1993 or 1994. I was installing a video projection system into a hotel meeting room. The computer to which I had to connect the video projector was a funky little purplish pizza-box with the name "Indy".
I fell in love.
I still don't have an old Indy or Indigo - or even any UNIX workstations - so far, I've kept my cravings in check with Solaris 8 x86, FreeBSD and Linux.
There was something running on that Indigo that day. I don't remember it very well, but I remember the name: Music Box. In the window, there was a picture of a woman, and cartoon musical notes would fall out of her mouth as eerie but beautiful music played in the background. For several hours while I aligned the video projectors in the room, it was on the screens, 10.5x17 foot image of an IRIX X session, with that eerie and hypnotic image and sound. It freaked out my boss, too. :)
It looked like something you'd just leave on the screen of your computer to amuse passers-by. Anyone have any idea what it was, or whether it's available for anything but MIPS/IRIX? A Google search didn't prove too helpful.
Re:OT: Music Box for IRIX? (Score:1)
Shut the fuck up (Score:1)
Re:Shut the fuck up (Score:2)
IT'S INDIGO FOR FSCK'S SAKE
Well, of course it is. I was building up to it, and revealing the name in the color of the box was rather self-defeating to the melodrama inherent in later naming it an Indy.
Re:OT: Music Box for IRIX? (Score:3, Informative)
Well, not exactly. The Indy is a blue pizza box. The Indigo2 was a much larger turquoise/green box, and the Indigo2 Impact (which had tons of problems, BTW) was the same box in purple.
I should know - I have an Indy and an Indigo2 ;)
Re:obvious (Score:2, Funny)
a plumber? like this guy...? (Score:2)
Overclocker kiddies will do anything for an extra 8 frames-per-second.
Now here's a scary thought... how about the overclockers team up with cluster builders??
Re:Deus Ex!!! (Score:1)
MIPS is at a dead end (Score:2, Informative)
Re:ItaniuMMM--good!!! (Score:1)
Available to anybody with an account there? (Score:1)
NCSA has a slightly larger system of (more or less) the same design on order from IBM, and I think the installation is ongoing right now.
--Troy