Silicon Graphics

Specs On New SGI Onyx And Origin

An anonymous reader wrote in to tell us that SGI has announced their latest and greatest MIPS-based computers, the Onyx and Origin 3000 line. Up to 1 TB RAM and 512 processors, all on a single system (not a cluster). Beyond Boxes has a nice summary, too. This is definitely a great system for anyone who wants to have their computer be the size of several refrigerators ;)
  • Maybe this would be enough hardware to run a smooth version of the Daikatana demo! I figure something that large might be able to get that game up to about 20 fps.


    luckman


  • with the PowerMac Cube with Gigabit Ethernet and Mac OS X you'd have something wonderful that would make the Beowulf-type architectures mindblowingly fast and amazingly small...

    Imagine a room full of these ----

    :-)
  • Actually, one of the benefits of the system's modular building blocks is that one rack can hold up to 32 processors. This is double the density of the o2k. The refrigerator comment is correct though. 4p in 4U is bound to kick off some heat.
  • The word noticeably is what bothers me.

    Yeah, I know. After I posted it, I realised that noticeably wasn't the right word. Perhaps measurably would have been better. The point I was trying to make (and I guess I didn't succeed very well) is not that you could use a cluster of off-the-shelf machines instead of an O3K, but that the O3K (and other large machines) has some cluster-like properties housed in a single case. Bandwidth and latencies may be orders of magnitude better, but architecturally, they're similar (although not identical).

  • All other companies' high-end servers use symmetric multiprocessing

    Ermmm... no. Data General and Sequent have both been shipping NUMA boxen for many years now.

  • Check out the internal bandwidth of those boxes; 5-year-old O2s and Indigo2s still have greater memory bandwidth than the latest Pentium motherboards. I have yet to see a multiprocessor Athlon board. Voodoo5 cards are meant for two things:

    1. Getting high FPS in games
    2. Getting high benchmark scores

    SGI concentrates on realtime rendering and processing. Their video cards have dedicated MPEG/JPEG rendering engines, and don't forget the CPUs are 64-bit. Given the choice of a 1 GHz PIII or a 200 MHz Onyx, I would take the Onyx in a heartbeat.
  • Ah!

    That was the crucial detail I was missing.

    Thank you very much.
    ---
    pb Reply or e-mail; don't vaguely moderate [ncsu.edu].
  • One thing I never understood is what the heck is a person to do with a 320MB frame buffer?

    I, personally, wouldn't have a use for one (other than bragging rights :-), but it's not actually that big. At 48bpp (16 each for RGB), you could get 3900x1792, an aspect ratio and resolution that may well be suitable for motion pictures using digital projection. Alternatively, you could have a triple-headed 1600x1200 display. I've worked at companies where three 1280x1024 displays per machine were commonplace, so it's not that unreasonable.

  • I was thinking prelaunch type stuff :)
    But yeah, that's the one time these monsters would look cheap: when you check how much it would cost to put one in orbit.
  • Alright! Now I know what to ask my Grandpa to get me for Christmas! I think this will finally give me the edge I need to win the CPL. I should pull about 98763214 frames per second in Quake 3. Guess I have to buy a new car to bring it to the next LAN party though.
    Dissenter
  • Wait, by my calculation, a 1600x1200 display has 1,920,000 pixels. Each pixel takes 8 bytes to store, so we are using 15,360,000 bytes per frame buffer. That's right around 15 megs. Now, we of course want double buffering, so that's 30 megs per display. 3 displays is 90 megs.

    Now, we can only drive a maximum of 8 displays per pipeline, so to drive 8 displays at 1600x1200 means 240 megs. That means that there is still 80 megs left. Hmm. I guess if we bump up the resolution to 1920x1200 that would increase our frame buffer usage to 294 megs, which only leaves us a little over 20 megs free.

    Now, 8 displays may sound ridiculous, but now that I think about it, these machines are made to run CAVEs and video walls, so maybe 8 displays ain't so bad.
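
    For anyone who wants to check the arithmetic, here's the same calculation in C (assuming the 8 bytes/pixel and double buffering above; figures are illustrative, not SGI's spec sheet):

        /* Sanity-check the frame buffer arithmetic above. Assumes
         * 8 bytes per pixel (48bpp padded to a 64-bit word) and
         * double buffering, as in the post. Decimal megabytes. */
        #include <stdio.h>

        int main(void)
        {
            const long bytes_per_pixel = 8; /* 48bpp, word-aligned */
            const long buffers = 2;         /* double buffered     */
            const long displays = 8;        /* max per pipeline    */
            const long w = 1920, h = 1200;

            long per_display = w * h * bytes_per_pixel * buffers;
            long total = per_display * displays;

            printf("per display: %.1f megs\n", per_display / 1e6);
            printf("%ld displays: %.1f of 320 megs\n",
                   displays, total / 1e6);
            return 0;
        }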
  • Robert Brown & I hashed this out a while ago on the Beowulf mailing list.

    The nicest thing about the SGI machines is that they have low-latency interconnect. Complete cache coherency is on the order of nanoseconds - not your microsecond latency on SCI or Myrinet, or your millisecond latencies on Ethernet (and those latter latencies are for data transfer only). A lot of supercomputing tasks can be done by a cluster of Linux machines these days; but for exactly the class of applications you're talking about (lots of communication/contention) this is the machine you'd want to run it on. The other class of applications (of course) is detailed simulations with a fine grid size - where else can you get 1TB of shared memory? ;)

    As far as the kernel goes, it's been scaled from 1..512 processors. There is almost no kernel overhead in computational code to begin with anyway (sure, that simulation may run for 100 hours, but it makes about 1000 system calls), but Irix does a pretty decent job of staying out of the way (aside from periodic stupidness of the scheduler anyway).

    No offense, but comparing Linux/BSD/whatever kernel overhead to commercial high-end UNIX overhead is like comparing apples to oranges. Sure, Linux may scale to 8 processors ok, but that's way different than scaling to 512 (which is very difficult to do).

  • Wait, by my calculation, a 1600x1200 display has 1920000 pixels.

    Yep, mea culpa. I was dividing 320MB by 48, not by 6 (or 8, if you assume 32-bit word aligned accesses).

  • Even if that Gates quote is really true (it seems to be more of an urban legend than anything else), he was most definitely not talking about serious computers (workstations, servers, mainframes, supercomputers).

    He was referring to personal computers running DOS (I'm not sure that anyone outside of Xerox had seriously done any experimenting with GUIs before at least the early 80's). But regardless, it's just silly to keep bringing that up 20 years later...

    End of off-topic comment...

    And yes, I know you were being sarcastic, I just failed to laugh is all... :)
  • by Christopher Thomas ( 11717 ) on Tuesday July 25, 2000 @07:53AM (#907559)
    So the rest of the industry is playing "catchup" to SGI ?! I don't really think there's a huge market for large-scale multiprocessor machines when equivalents can be built up easily from cheap hardware and fast network infrastructure.

    Actually, they can't be.

    This is not a cluster - it's a multiprocessing supercomputer designed as a single unit. The internal busses have far, far greater bandwidth than even the expensive networks in a high-end cluster.

    It does have competition - the Sun Starfire. But that's about it.

    Clusters are definitely useful, and give you by far the best bang-for-the-buck on problems with relatively light communications load, but problems with a heavy communications load are best run on machines with high communications bandwidth, like this one.
  • I think, if you need an SGI box with 256 or 512 processors, nothing else will do... There's no way you can make a Beowulf cluster that would come close to the performance of one of these boxes (in my opinion), because the bandwidth of the interconnects inside each box dwarfs anything else available.

    Besides which, this isn't a single-user machine. No one in the world is going to buy one so they can sit a user in front of it to make animations. This is the type of box that will get shipped to the DoD, NSA, various universities, and large corporations that need to build virtual prototypes (Detroit and Japan).

    And no. At this level of the market, Linux just can't compete. Lower-end, yes, but not up here...
  • Well, gee, SGI makes a box that the operating system it ships with doesn't support. </sarcasm>

    I think you missed the "UNIX is amazing" comment.

  • by foobar104 ( 206452 ) on Tuesday July 25, 2000 @04:21AM (#907562) Journal

    1. The CDROM is on an internal FireWire bus.

    2. The system disk is Fibre Channel.

    3. SGI hasn't made a big deal about it yet, but the system will accept either MIPS or Intel processors in the same CPU modules. The MIPS processors come on one kind of daughtercard, and the Itaniums (Itania?) on another. You can't mix-and-match MIPS and IA-64 CPUs in the same machine, but you can mix-and-match in the same cluster.

    4. The IA-64 based versions of the 3000 series will include the Linux kernel along with some IRIX compatibility layer.

  • by stab ( 26928 ) on Tuesday July 25, 2000 @04:21AM (#907563) Homepage
    Amusing bits from the page:

    Debra Goldfarb, group vice president at analyst firm IDC, agrees: "Modular computing empowers end users to build the kind of environment that they need not only today but over time. SGI, with this product, is really ahead of the curve in the market. We are seeing the [rest of the] industry absolutely trying to catch up" with SGI.

    So the rest of the industry is playing "catchup" to SGI ?! I don't really think there's a huge market for large-scale multiprocessor machines when equivalents can be built up easily from cheap hardware and fast network infrastructure. The last time I saw an SGI was the NASA AMES crew using one for their amazing Viz tool, and even they were making mutterings about porting it to NT and Linux for ease of maintenance and actual use.

    In addition, SGI Origin 3000 servers and SGI Onyx 3000 visualization systems reflect a return to SGI's core competencies.

    At least that's true. The NT machines were a joke. Anyone tried SGI Linux yet?
  • Yes, BUT Linux can't at this time even think about supporting something like that. If they sent Linus one, maybe, but I suspect he'd rip out the MIPS processors, replace them with Crusoes, and send them back a 512-processor Crusoe machine... with no cooling :-P
  • I've always drooled over these things, but I am curious about bang for buck. How do these compare to Sun, and to the latest and greatest IBM and Intel based clusters? Cost is such an important issue these days, and with technology ramping forward, do you really need something this mean? This of course is in the domain of server machines and not imaging stations; there I believe SGI will always be king, or queen if you please.
  • With their InfiniteReality3 technology they really need to name a machine the Ontology3000... ugh, I know, bad pun.

  • Aah, yes. The $64,000 question. The answer to this is NUMA and hypercube-structured interconnect. Check out the specs. It's not an SMP. It is shared memory like an SMP. Looks and acts like an SMP at all processor counts.
  • At the Academy we still use an Onyx RE2, 4xR4400@200, 640MB RAM, ASO (4 sound channels, 6 serial ports), and the Sirius Video option as our primary system to build real-time virtual sets. Our new NTs and G4s are faster for some simple tasks, but when it comes to realtime uncompressed video I/O this 6-year-old monster is still a killer; details like 1.2 GB of internal bandwidth and 48-bit color will be reached on x86 someday... Yes, there are also several O2s, Octanes, and Indigo2 Impacts for 3D and video preprocessing. This machine, grandpa of the O3k, is still useful and warm ;-)
  • It seems Sun can do these things - so they can't be that hard -
    (didn't SGI have a rather big lead on Sun with SMP??)

    On the Sun "Starfire" (E10000), CPU boards (with up to 4 CPUs) can be hot-swapped
    (after migrating the processes running on them, of course...)

    Also AFAIK CPU/system boards can be added to the system (with no reboot),
    as well as moving CPU boards between domains (again: with no reboot).

    Of course the E10000 has no NUMA & tops out at 64 processors.

    BTW you may find it interesting that Sun is claiming the E10000 replacement (using UltraSPARC III) will have NUMA & scale to 1000 processors.
    You may also have noticed that a lot of this functionality (except domains) has been "downported" to the E[3456]500 models


    --
  • How hard this is depends on the architecture of the OS. In Irix it is hard. I don't know how hard it was for Sun to get to where they are. I have heard, however, that Sun's dynamic subtract ability doesn't work under certain conditions. The trick is, with NUMA, you are migrating both the memory and the processes. We can migrate processes very easily. Getting all the memory off the node is a lot harder. Sun doesn't have to deal with this until they get NUMA.

    As for the E10k follow-on being NUMA, I'll believe it when I see it. Sun has previously said that they don't think NUMA is a good thing. Also, we heard they were working on an architecture called COMA (bad name, but it stands for Cache Only Memory Architecture) where you treat all of memory as a cache and let cache lines move wherever. If they *are* really doing NUMA on 1000 processors, they are going to find that the jump from 64 to 1024 is more like scaling a cliff than a gentle slope... Besides, Sun's NUMA stuff is vapor - ours runs *now* :)

  • by warrior ( 15708 ) on Tuesday July 25, 2000 @08:03AM (#907571) Homepage
    The whole system has one contiguous view of memory. The NU means "non-uniform", as in the memory access time is non-uniform. If a process's memory is located on the processor module it's running on, the memory access is fastest. If it has to jump one module away, the memory access time increases by 100 nanoseconds (roundtrip).

    Architecturally, it's completely different than anything Sun has to offer. Sun has been promising a NUMA machine for years and still hasn't delivered. The closest company to SGI is Compaq (DEC), and their top-of-the-line offering can almost compete with the Origin _2000_. All other companies' high-end servers use symmetric multiprocessing, which becomes limited as more and more processors try to access the shared memory bus, ultimately bringing in negative returns as you add more processors. This NUMA architecture incurs very little (if any) penalty by adding more processors, as long as the hardware and OS do a good job of placing processes and memory (keeping them physically near). Also, the machine is _not_ limited to 512 processors.

    To give an example of the power of this box, a company has certain calculations that they run day to day. On their top-of-the-line Sun hardware, it takes about seven hours. On the O3k, it takes seven seconds!

    What does being a modular system have to do with being a cluster? By being "modular" it simply means that you can plug in more of whatever you want, whenever you want. I believe you can even mix faster CPU modules with existing ones as they become available, protecting your investment. This is not a cluster.
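
    To make the placement point concrete, here's a minimal sketch of the idiom (assuming a first-touch page placement policy like IRIX's; this is plain POSIX threads, not any SGI-specific API):

        /* Sketch: why initialization pattern matters on ccNUMA.
         * Under first-touch placement, the node whose CPU first
         * writes a page gets that page in its local memory.
         * Plain POSIX threads; no SGI-specific calls. */
        #include <pthread.h>
        #include <stdlib.h>

        #define NTHREADS 4
        #define N (1L << 22)            /* 32 MB of doubles */

        static double *a;

        static void *init_and_work(void *arg)
        {
            long t = (long)arg;
            long lo = t * (N / NTHREADS), hi = lo + N / NTHREADS;

            /* Touch (initialize) our own slice first, so those
             * pages land in memory local to the CPU we run on... */
            for (long i = lo; i < hi; i++)
                a[i] = 0.0;

            /* ...then later accesses to the same slice stay local,
             * avoiding the per-hop remote penalty described above. */
            for (long i = lo; i < hi; i++)
                a[i] += 1.0;
            return NULL;
        }

        int main(void)
        {
            pthread_t th[NTHREADS];
            a = malloc(N * sizeof *a); /* pages placed at first touch */
            for (long t = 0; t < NTHREADS; t++)
                pthread_create(&th[t], NULL, init_and_work, (void *)t);
            for (long t = 0; t < NTHREADS; t++)
                pthread_join(th[t], NULL);
            free(a);
            return 0;
        }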
  • Look at the bit width, not benchmarked transfer rates. The MIPS procs are 64-bit, and so that means they process data faster.

    By the way, the x86 scores you quote are exaggerated. Most Intel PCs would choke at 400 because of bottlenecks.

  • by sjbe ( 173966 ) on Tuesday July 25, 2000 @08:20AM (#907573)
    We use SGI machines where I work, and frankly we have been slowly getting rid of them. Why? Except on the very high end, PCs are faster and cheaper. You can buy a PC for about $8000 right now that is faster than an Onyx2 with InfiniteReality graphics. I know because we did just that. Granted, the Onyx2 is >2.5 years old, but it cost 30 times as much when new. The Onyx3 will take the lead back for a time, and nothing in the PC world can compare to a high-end multi-pipe Onyx/Origin even today on raw performance numbers, but SGI is still losing ground to the PC makers. Their performance edge just isn't large enough to come anywhere close to justifying the huge markup in price for their machines. The only people who buy them are those at the very high end of the market who don't really have much of a choice right now.

    Don't get me wrong, I love SGI's machines and use one daily. I even passed up a faster PC (running Windows) because I like it so much. But there is no way I could cost-justify getting a new one. They simply do not provide enough performance to justify the cost anymore. None of the demos of their stuff we've seen indicate that their new machines are a huge leap in performance (meaningfully faster, to be sure, but not nearly enough to justify the cost of a new one). Fortunately for SGI, they make a ton of money on each Onyx & Origin they sell, but if they aren't careful this could easily evaporate out from under them. They make very cool systems, but it is not a well-run business IMO. I'll be somewhat surprised if SGI doesn't get bought out by someone in the next year or two.

  • Part of the reason they are selling it is that the second processor tray is bad, and repairing it would cost about the price of replacing the system.
  • Slight clarification - the key in the design is the NU in NUMA (Non-Uniform, as in Non-Uniform Memory Access). A processor accesses memory on its own board quickly, and memory elsewhere more slowly. The question is how much slower "slower" is. This also means you really want processes to stay on their processor, and to have good locality of memory, because otherwise you take latency hits.

    Check out Chapter 7 of Greg Pfister's "In Search of Clusters" (ISBN 0-13-899709-8)
  • by FascDot Killed My Pr ( 24021 ) on Tuesday July 25, 2000 @04:03AM (#907576)
    "This is definitely a great system for anyone who wants to have their computer be the size of several refrigerators."

    I foresee a day when computers may be as small as one refrigerator. Probably there will be a world market for no more than 10 of these.
    --
    Give us our karma back! Punish Karma Whores through meta-mod!
  • I think you are spreading a myth about locking, Godfrey. It's not the addition of locks themselves that causes a performance penalty; after all, they result in a no-op on a UP system. The gratuitous adding of locks without regard to design (this is what lm complains about often) does add maintenance problems, but that is a matter of design, not performance.

    The only performance penalty to making a system scale in number of CPUs is when you have to make tradeoffs in the design of structures and algorithms. Usually you can (with a little more thought) find a design that will help both small-system performance and large-system performance... But sometimes that isn't possible. In those cases you could split and maintain two separate subsystems for the two designs, but this again introduces maintenance headaches. Luckily we haven't gotten to that point yet... And I honestly don't know whether it will be worth it to pursue. (I'm making the assumption that we can scale into the tens of processors without hitting that point.)
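
    For the curious, the granularity tradeoff looks roughly like this in code (a generic illustration with POSIX mutexes and a hypothetical hash table, not actual kernel code):

        /* Sketch of the lock-granularity tradeoff: one big lock is
         * simple but serializes every CPU; per-bucket locks scale
         * better at the cost of more locking machinery to maintain.
         * Generic illustration, not IRIX kernel code. */
        #include <pthread.h>

        #define NBUCKETS 256

        struct node { int key; struct node *next; };

        /* Coarse-grained: a single lock covers the whole table. */
        static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
        static struct node *coarse_table[NBUCKETS];

        void coarse_insert(struct node *n)
        {
            pthread_mutex_lock(&table_lock);  /* all CPUs queue here */
            int b = n->key % NBUCKETS;
            n->next = coarse_table[b];
            coarse_table[b] = n;
            pthread_mutex_unlock(&table_lock);
        }

        /* Fine-grained: one lock per bucket; CPUs touching
         * different buckets never contend with each other. */
        static struct bucket {
            pthread_mutex_t lock;
            struct node *head;
        } fine_table[NBUCKETS];

        void fine_table_init(void)
        {
            for (int i = 0; i < NBUCKETS; i++)
                pthread_mutex_init(&fine_table[i].lock, NULL);
        }

        void fine_insert(struct node *n)
        {
            struct bucket *b = &fine_table[n->key % NBUCKETS];
            pthread_mutex_lock(&b->lock);
            n->next = b->head;
            b->head = n;
            pthread_mutex_unlock(&b->lock);
        }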
  • This is definitely a great system for anyone who wants to have their computer be the size of several refrigerators ;)

    And wants several refrigerators to cool the system, too. Can Linux even handle that many processors, let alone make good use of them? UNIX is simply amazing...

  • I think these machines are simply awesome, but you have to wonder how many of these really get sold. Yesterday ASCI White was announced to be sold to the public, and now we see this bad boy. Does anyone have a link or figures on how many of these sell? How long does a company keep a supercomputer after buying one? The specs are impressive and so is the price tag, but do many companies, or countries, buy these?
  • by pb ( 1020 ) on Tuesday July 25, 2000 @04:24AM (#907580)

    System Bandwidth
    3200: 11.2 Gigabytes/sec
    3400: 44.8 GB/sec
    3800: 716 GB/sec

    ...methinks they skipped a decimal point here.

    (if not, please explain!)
    ---
    pb Reply or e-mail; don't vaguely moderate [ncsu.edu].
  • Well, with a yearly revenue of 2.5 billion, I'm guessing that they certainly sell enough of them... Actually, I don't think every one that moves out has the full complement of 512 processors, either. I was always under the impression that these quite sexy machines were used for, well, not-so-sexy stuff. Plain-jane 3D visualization or transforms from data, brute-force number crunching, done by some of our big, scary corporations (I understand oil companies love these things). As a biochemist, I use a couple of their little baby machines, and should I ever score some fat gov't grant, I can tell you I'd love to be next in line to snap one of these beasts up...

  • BTW, the virtual starship tour (or whatever gimmicky name it's called) at the Adler planetarium uses SGI boxes to do the rendering on a skydome ceiling... can you imagine Descent 3 on that?

    Break out the motion-sickness pills!

  • This is a MIPS-based processor architecture. And MIPS machines have extremely sketchy support in Linux right now; that's what IRIX is for.

    And the efficiency of the system depends on the efficiency of the processes running on it; if your program knows how to use stuff like MPI and forks itself off lots of times, then yes, you will get extremely good performance. But one process of SETI@Home won't do very well...
  • Yesterday ASCI White was announced to be sold to the public...

    Damn, I missed that, and I was thinking it was custom built for Lawrence Livermore or whomever. Just the thing for the new 3D space shootup I'm writing in Java. PS, how big are two basketball courts?

  • by jd ( 1658 ) <imipak@yahoGINSBERGo.com minus poet> on Tuesday July 25, 2000 @04:26AM (#907585) Homepage Journal
    1. Who buys these computers?
      • Usually, the Mac crowd, to stay at the top of the distributed.net key rate charts.
    2. What practical use is a system like this?
      • It allows you to have a refrigerator that runs your X10 network.
    3. Will it run Linux?
      • Probably. If not, wait a few days. The arch is already there, so the port will be trivial.
    4. Why is it green?
      • SGI are possessed by the Ghost of Lotus Racing, a sad spirit forever doomed to wander the high-tech world, causing companies to develop phenomenal technology and then collapse under their own gravity.
  • Well.... before we can support 512 processor MIPS boxen we need to support single processor and dual processor.... IMHO low end multiprocessor SGI box support is where Linux needs to go on the SGI architecture
  • I don't know a whole lot about SMP.

    That said, what's to stop each running thread from using one or four or whatever processors? I mean, unless the software is specifically written to use 512 processors, wouldn't it kind of work as a really great multitasker?

    Like I said, I don't know much about SMP.
  • I am sick of this...

    Goatse.cx is not a troll thing! It is a spammer thing!

    All you spammers ran off with the perfectly good troll name and defaced it! It's like the l33t d00dz script kiddies who ran off with the hacker name! You are not trolls... you are spammers!

    </rant>

  • Moderators!!! I believe this was intended as a joke [ibm.com]
  • One process with 512 _threads_ will do _really_ well. No one else ships a single-memory-image computer anywhere near this big. They can't. SGI took the R&D hit early with the O2k, adapted from the Stanford DASH machine. IRIX took the complexity hit early with 6.4 to improve concurrency/lock granularity.

    Now they ship this monster. For large problems, no OS can touch IRIX, and no hardware can touch this. For people wanting to make the "clusters are better" argument: well, if you happen to have the small variety of problem that's "clusterizable", this thing will run those too, and quite well. Furthermore, you can always cluster a bunch of these guys together for _thousands_ of CPUs and _terabytes_ of RAM... and it will all be using interconnects a hell of a lot faster than what you can get elsewhere.

    Finally, if you pay attention, you'll see that the whole thing is totally modular. It doesn't have to run MIPS CPUs. You can yank the C-bricks and throw in an IA-64 C-brick (sometime in the future). It's _NOT_ a MIPS-based architecture. It's a modular supercomputing platform.

    SGI has done their homework adopting the lessons learned in DASH (and later FLASH). As a result they've got the most scalable real-world-useful computer there is.
  • by mashey ( 215057 ) on Tuesday July 25, 2000 @10:00AM (#907591)
    To fix a few misconceptions:

    1) The bricks are (mostly) 3U [5.25"] or 4U [7"] high, and the same bricks are used to construct a wild range of systems, with huge variations in CPU-I/O-storage ratios.

    2) In some cases, the bricks will be sold separately and embedded into airplanes, vans, etc., by defense contractors. I'm told the submarine folks really love the idea.

    3) In a half-rack (SGI Origin 3200), you can have 2-8 CPUs [1-2 C-bricks], a required I/O brick [I-brick], and either another I/O brick (I, P, or X) or a disk brick (D-brick).

    4) People always announce a wide range of systems: realistically, most of these machines will be 1-2 rack systems, just like they are for everybody else. People who buy lots of computers use racks anyway - the last thing in the world they want to do is waste precious floorspace.

    5) IRIX already scales to 512P fairly well, and NASA Ames runs individual shared-memory jobs on their older Origin2000. It already saved you a lot of tax money.

    6) SGI is not shipping Linux on the MIPS-based machines. This is a "Caterpillar" announcement, with a lot of shoes left to drop, like IA-64 Linux versions coming later. A major point of the brick thing is that you can change bricks while re-using most of what you already had; you can, for example, introduce a PCI-X or, later, InfiniBand brick without obsoleting older I/O bricks. Also, you can build C-bricks with Intel IA-64s, and those will run Linux, not IRIX. All of the rest of the hardware infrastructure & bricks are the same.

    7) SGI is working hard with the Linux community on scalability, i.e., to let it handle more CPUs well without damaging the basic Linux. Personally, I doubt that it will make sense to try to scale Linux to where IRIX is, but it will certainly scale big enough to be interesting [say 32-64P in a single system image]. Using partitioned hardware, one can get NUMAlink speeds between partitions, and that satisfies many customers.

    8) The customer should be able to pick the size of machine, and then cluster that size together. For some customers, 1P + 64MB is just dandy, and they buy clusters of IA-32 boxes. I know customers where the right size happens to be 32P, 16GB of memory, 2 disks, and 3 Ethernets [one full rack], and then they cluster a lot of those. I know customers that cluster 128Ps, and there's one who would cluster 512Ps if they had the money. If the NASA Ames folks had the money, what they really want is a single machine with petabytes of memory and petaflops. I was sorry to tell them: Not Likely Soon.

    9) Don't get too crazy with the fact these systems can go really big. I've lost track, but I think there are 30,000 of the Origin2000s & 200s out there, and most systems are small to medium. Of course, the big systems account for many CPUs.

    10) The NUMAflex brick approach has many subtle benefits, but is hard work. In some thread, people mentioned backplanes... but there aren't backplanes in the normal sense. Each C-brick has 4 MIPS CPUs, memory, and an ASIC crossbar, with 2 ports out the back for cables that run (peak) rates of 3.2 GB/sec (2 * 1.6GB) and 2.4 GB/sec for I/O to separate I/O bricks. Each brick has internal circuit boards, but there is nothing that looks like a normal CPU backplane. To do this, you have to be able to run 3-meter/5-meter cables at these rates, and do tricky circuit engineering. Later versions will independently improve the interconnects as well, not just upgrade the bricks.
  • Well, Sun, for example, sold 500 [cnet.com] $1 million-plus refrigerator-sized E10000's [sun.com] last quarter alone. There are probably more people out there who want this kind of stuff than you might think...

  • why not just get a custom box with 4 Voodoo5's or GeForce2 GTS's and whacks of RAM and throw in a couple of Athlons. A box that does the same graphics processing, 1/10th the price.

    But does it really do the same graphics processing? Can a Voodoo5 or GeForce2 handle 48-bit colour for example (as used by the motion picture industry)? How about a 320MB framebuffer with 256MB texture RAM?

  • Pixar seems to like them, as do lots of CGI companies. One of the biggest strengths of this box for them is the video processing. You can make a fancy Beowulf cluster for cheaper at the same power, but the best video you're gonna get is a GeForce GTS/Voodoo5/Radeon, whereas this mamajamma has 4 gigs of video RAM available *drool*. Not to mention the pretty boxes it comes in. I'd feel kinda odd as an astronaut walking by 512 beige boxes hooked to the satellite sitting on a metal rack :)
  • We got one of the earlier Onyx machines (creatively named onyx.astro.wisc.edu) back in 1993. It was pretty novel with its dual processors and fast OpenGL hardware. When some SGI programmers ported Doom (and later Quake) to the MIPS chip, some of us grad students used to play on the dept SGI boxes, including that dorm-fridge-sized machine. But for all its lofty framerate scores, our Onyx had no sound, so the poor sucker sitting at that terminal often got fragged with no warning.

    But alas, the proprietary $15,000 memory module fried itself after the warranty expired and the machine was sold (for parts, I guess). No heated footstools in our computer room any more...
  • And they're all sitting (practically) unused in the backroom here at work because some fool in IT thinks that we need an E10000 to serve simple SQL queries and NFS... grr... and they were too clueless to resist our Sun rep's charms, probably...
  • err shuttle :) not satellite *pardon the spelling its early with little caffeine*
  • They are not bought by the Mac crowd. Just because SGI likes pretty blue colors doesn't mean it's the mac crowd. These machines are used by anyone who has a whole lot of budget and some serious number crunching to do (lower budget goes for beowulf cluster).
  • My university still has an Onyx. R10k processor, 128 megs RAM, RE2 graphics. Oh, we do have the ASO board for 4 sound channels and an additional 6 serial ports (for a total of ten). We also have the MCO (instead of one 21" display, it can drive 6 TV-resolution displays). Currently, the machine is mainly still used only because of the MCO and serial ports. It is darn hard to get multiple video channels out of an NT machine, not to mention the ugliness of having many serial ports on NT.

    The prof in charge of the machine constantly complains about performance. He seems to think this machine is hopelessly out of date since the new NT machine can pump out more polys. However, I think that this machine is misused. The first and foremost problem is that instead of using OpenGL, they use a library called World Tool Kit (WTK). WTK is easy to use, but it limits what can be done. Specifically, you can't do the types of things with WTK that the Onyx excels at, namely multi-pass texturing, so WTK is just emphasizing the machine's weaknesses.

    I'm trying to get my basic world system up and running on Linux (I don't have much yet) so that I can port it to the Onyx in such a way as to emphasize the Onyx's strengths, rather than its weaknesses.
  • One thing I never understood is what the heck is a person to do with a 320MB frame buffer? I just can't come up with any way to use a significant amount of that.
  • It shows how stupid Mr. Gates really is. Which is scary, because through his wealth he has more influence and power than anyone on this chatboard will ever hope to have.

    --
  • Although A|W announced the Maya Linux port and the renderer is already there, this would be great for a render farm.

    I got to use a 4-node Origin 2100 for a bit until it replaced the 2-node Origin we had for a file server.
    It was the fastest renderer I had. Great for single frames and good for multiple frames.
    If a company needs large frame renders or single frames done fast, a NUMA-style machine is needed.
    Having the large capacity of memory also allows several smaller render jobs at a time.

    A machine like this would be a dream for me.

    I don't think that the cost/performance metric would pay off for the type of rendering that we do, but I can see other places that would benefit from it.
  • People who still buy high end SGIs?
    How about these people:
    FF Movie Info [thegia.com]
    Watch the newest trailer.
  • by Anonymous Coward
    In response to the person who asked "who buys these supercomputers". I read an article in the Silicon Valley Metro newspaper, that said that the number one file server brand used by the on-line porn industry is SGI.
  • Yes, there's a decent market for these machines. Given SGI's situation, however (they've restructured every quarter for the past 2 years) and the fact that the (non-embedded) MIPS processor line is a few generations behind similar offerings from IBM and Sun, I've a feeling that many customers with just-fat-enough wallets will take a wait and see on these machines, or just look at similar offerings from more stable companies.

    I don't know about that. We benchmarked a handful of our regularly used programs - mostly molecular dynamics and quantum chemistry stuff - and SGI's 3000 system looked just as good as the competition if not better. In fact, SUN was so bad on floating point performance that we didn't bother to run the full complement of benchmarks. (That'll change late this year, but we needed systems now.)

    Anyway, SGI came out on top when raw CPU performance, system scalability using our codes and I/O were considered together. So, we are going to receive a 3800 as soon as SGI can deliver one. Can't wait!

    As far as company stability goes, no customer is going to buy a truly large machine without a lot of legalese in the contract that spells out what happens if things go wrong. Company failure is usually factored in. Me, I am not concerned. SGI has too much good technology, and also too much cash in the bank to simply disappear from the scene. They could be snapped up by someone else, perhaps, but would that change much? SUN did not make any major changes to the E10000 when they got it from Cray in '96, did they?

  • Those are the actual bandwidth rates. The Origin/Onyx 3000 series is not SMP nor does it have a "Bus". It's a hypercube NUMA setup with a thick mesh of interconnects. This -is- the "Cray" of today. Read up on ccNUMA and hypercube architecture on SGI's website, they have several good technical documents available.
  • Use a VR cave or one of those 120 degree triple projector screens. Peripheral vision is more important than a big screen.
  • As far as maintenance and actual use go, it would really be hard to find something simpler than an Origin 3000. Even at 512 processors with a cross-section bandwidth of over 700 GB/sec, it's still JUST ONE MACHINE. And it runs IRIX. BRU, INST, TAR, etc... it's a monster of a beast that's as easy to maintain as a workstation.
  • Sure!

    http://reality.sgi.com/sgiquake
  • by Tet ( 2721 ) <.ku.oc.enydartsa. .ta. .todhsals.> on Tuesday July 25, 2000 @04:37AM (#907610) Homepage Journal
    Up to 1 TB RAM and 512 processors, all on a single system (not a cluster)

    With boxen this size, the boundary between a single machine and a cluster tends to get a little blurred anyway. Even SGI are stressing the fact that it's a modular system. Basically, each module has its own CPUs and memory, and has connectivity to the other modules in the system. What's the difference between that and a conventional cluster? Mostly the phenomenal inter-module bandwidth, but that's just a matter of numbers. Architecturally, is there much difference? OK, so you have a single OS image running across all CPUs, but is that even true any more? Certainly other large systems (e.g., from Sun or Data General) let you run multiple versions of the OS concurrently on a single box as you see fit.

  • Also, it's one machine that can go wrong all at once - the flip side to that argument is that you can chop and change a cluster once it's installed, and maximise its usage according to what you're doing.

    Granted though, initial installation of an SGI is easier than a cluster.
  • by Anonymous Coward
    Wouldn't you prefer a good game of Global Thermonuclear War?
  • Keep in mind that until a month ago, SGI's top-of-the-line graphics board sets (MXE and IR2) were the same designs that originally appeared as Maximum IMPACT on Indigo2 and InfiniteReality on Onyx R10000. About five years ago, give or take.

    During that time, entire graphics hardware companies have come and gone. The really good ones have caught up to, and occasionally surpassed, what SGI was doing in 1994. Impressive. Most impressive. ;-)

    Now SGI has released Vpro, which despite having one name is actually two totally different workstation graphics designs. The Vpro you can get in the IA-32 workstations is basically high-bin commodity graphics hardware from a company that shall remain nameless.

    But the Vpro that comes in the Octane2 looks outstanding. I haven't had a chance to use it yet, so I won't endorse, but the design specs for the Buzz chip make it look like InfiniteReality performance on the desktop. Way better than anything in the commodity market right now, and way more expensive, too. It's one of those things: if you have to ask how much it costs, you can't afford it.

    1996:
    SGI: Here's our latest graphics architecture.
    The World: Wow!

    1998:
    The World: SGI isn't so great. My NNNN is just as fast as an Onyx!
    SGI:

    1999:
    The World: SGI sucks! AGP cards are better than dedicated workstations! Sell all your stock! Bleah!
    SGI:

    2000:
    SGI: Here's our latest graphics architecture.
    The World: Wow!

    And so we are all a part of the great Circle of Life.

  • Perhaps he should get the biMac or dMac :) check them out here [bbspot.com]
  • lol, it'll be a while before you see SGI's in space though :-)
  • Something this size does well for meteorological simulations, atomic weapons research - anything that entails MASSIVE numbers of computations. I wouldn't be surprised if you see these in places like NCAR (National Center for Atmospheric Research), NCSA, and the government labs like Los Alamos and Lawrence Livermore. I still remember seeing NCSA's purple monster, a 1024-node cluster of Origin 2000s (using an experimental node bridge).

    As for how this is different than a Beowulf cluster, look at the bandwidth! Even with switched 10/100 Ethernet as your Beowulf 'backplane', most switches have just enough backplane bandwidth to handle every 100 Mb connection, and some have a little less. SGI has always had amazing bandwidth numbers; this is just taken to the Nth degree.

    AND this is one machine, one OS, unlike a cluster of many independent machines - much easier to administer.

    These are simply awesome machines; now maybe SGI can sell a bah-zillion of them and I can get my Indy sold ;)

    g:wq
  • PC getting slow or out of date? Add a new processor brick, that gets detected and used with just a reboot.

    Any operating system that requires a reboot to detect a new config is not worthy to ever be called a server OS. Uptime is uptime, even when memory needs to be swapped or a disk added.

    We will only come out of these dark ages of clunky, cumbersome computing if we insist on it. Requiring reboots is evil, and should be minimized, whether for hardware or software. Architect for it.

  • testing my sig, testing testing
  • You of course won't have to do that with this REAL hardware. If you have a faulty C-brick, disable its resources, or partition it off, and then power it down. The rest of the RAM/CPUs operate normally. No rebooting required.

    FWIW, the O3k qualifies as "real hardware".
    Redundancy. What a beautiful thing.
  • There is a very important difference between ccNUMA machines like the O2000/O3000/Sequent and something like Ethernet (of any flavor). That is that the communication doesn't have to go through the I/O channel, which means *zero* syscalls to do communication between threads - it's all memory-to-memory. That means much lower latency on things like MPI jobs. Cards like Myrinet are trying to accomplish the same thing (direct user->user transfers), but as far as I know, you can only push data down them, not pull data from the other side.
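
    A minimal sketch of the zero-syscall handoff (C11 atomics between two threads; illustrative only, not how any particular MPI implementation is actually written):

        /* Sketch: thread-to-thread handoff with no syscalls, just
         * loads and stores. C11 atomics; a minimal illustration,
         * not how any real MPI implementation works. */
        #include <stdatomic.h>
        #include <pthread.h>
        #include <stdio.h>

        static int payload;          /* the "message" */
        static atomic_int ready;     /* the "doorbell" */

        static void *producer(void *arg)
        {
            payload = 42;                       /* plain store */
            atomic_store_explicit(&ready, 1,
                                  memory_order_release); /* publish */
            return NULL;
        }

        static void *consumer(void *arg)
        {
            while (!atomic_load_explicit(&ready, memory_order_acquire))
                ;                /* spin in user space: no syscall */
            printf("got %d without entering the kernel\n", payload);
            return NULL;
        }

        int main(void)
        {
            pthread_t p, c;
            pthread_create(&c, NULL, consumer, NULL);
            pthread_create(&p, NULL, producer, NULL);
            pthread_join(p, NULL);
            pthread_join(c, NULL);
            return 0;
        }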
  • Lots of people would be happy with that. Of course, it's really hard to pull off. IBM makes DB2 UDB for NT clusters and SP machines. I don't know if it's "fully distributed" or not, but I want to say that it is.

    It's tricky enough to design file-systems that are properly distributed. I did some design for a school thesis [unl.edu] for a serverless distributed file-system with useful fault tolerance features. That's pretty tricky in and of itself, even to support UNIX file semantics. Building on something like that to build a strong and safe RDBMS would be quite a feat.

    People _really_ like the single-machine programming paradigm. The OS at every level needs to emulate that behavior as much as possible, regardless of the reality of the situation. Hence, the need for a good file-system. (See Berkeley xFS for the right approach, or Centravision for a shipping product that looks interesting.) RDBMSes are already choked by locking algorithms and contention on SINGLE CPU machines. It should be no surprise that a fast RDBMS that is fully distributed and scalable isn't widely available. To do it right you've got to have transparent internal replication of basically everything. Not just data and meta-data, but even logic. Coming up with a serverless (and thus usefully scalable) scheme that gives strong enough guarantees for RDBMS applications, yet still survives correctly and quickly and doesn't bog down the system with locking, will be quite a feat for whoever manages to do it.

  • This, sadly, is not correct. You can hot-swap I/O on Origin 3000 but *not* C-bricks without a reboot. Yeah, it'd be really nice if that weren't true, but the software to subtract running processors from the system is *REALLY*REALLY* hard. It may get done at some point (i.e., the point where there's enough customer demand and engineering resources to make it feasible), though, since hard==fun for kernel programmers :) Brick addition (i.e., adding a C-brick to a running system without rebooting the system) is easier and will probably happen before brick subtraction. Neither is a committed project (*sniff* :)

    However, what you *can* do is shut down a single partition of a multi-partition system without affecting the rest of it. Also, we have some stuff in IRIX to throw away pages that have double-bit errors in them without panicking the system in some cases. More RAS features are planned to be added over time.

    And I definitely agree - the O3k is "real hardware" :)

  • by Anonymous Coward
    You wrote:

    It's possible to build a shared memory cluster, although I don't know of anyone doing so.

    Yes it is possible with some very large caveats. Most important is that there is little to no OS support for large memories (>8 GB or 64 GB in the 2.4.0 kernel). Secondly, the compiler technology doesn't know anything about the architecture of the machine (on the MIPS machine, it does), so it cannot schedule resources appropriately. Finally, the interconnect in a cluster is at least one order of magnitude higher in latency and at least one order of magnitude lower in bandwidth, which is terrifically critical to the performance of fine grained parallel codes (the raison d'etre for large SSI machines).

    So, yes, you might be able to build a machine with superficially the same properties, but it would cost you a tremendous time and effort to achieve anything like what SGI has. The question that immediately comes to mind is why you would want to do this. Certainly any such machine you could build (even out of commodity components) would not be cheaper if you factor in all the time and engineering effort that would need to go into it to achieve the same results.

    You also wrote:

    Yes, but not without cost. Local NUMA memory accesses will be noticeably quicker than remote NUMA memory accesses. Building a shared memory cluster from separate machines will give you the same properties, although the difference between local and remote accesses will be much greater.

    I have to scratch my head over this. The word noticeably is what bothers me. Each router that the memory traverses in the Origin adds ~200ns to the cost of getting the cache line. The O2k has at most 4 router traversals, so the cache line is at most ~1100 ns away. With page migration, you can even move frequently accessed pages around, though this is a hard tunable to adjust, and largely you get marginal improvements at best. The O3k has a lower latency traversal cost per router - something like 1/3 of the O2k's - and the bandwidth is also about 2x better. So noticeably here becomes ~600 ns.

    I don't know about you, but I won't notice that.

    Moreover, with the R12k's out-of-order execution, a cache line stall will not halt the computation. The R12k can have up to 6 cache line misses outstanding before the EU stalls. So remote pages (and their latency) are not noticeable.

    The cost to access remote pages is largely hidden and not noticeable. On a cluster, there is no OS support for remote pages. You have to use PVM/MPI/flavor-of-the-day to send messages and pages. Latencies are on the order of tens of microseconds at best. You will most certainly not get the same performance characteristics, and very different scaling properties.

  • 4. The IA-64 based versions of the 3000 series will include the Linux kernel along with some IRIX compatibility layer.

    I wonder how this can be possible -- it sounds like something marketing pushed through because "Linux is very hip". Who the f. would be dumb enough to run Linux/Intel on one of these things?

    Linux scales well to what, 4 processors?

    Of course, this is my uninformed impression. But IMO Linux is for PCs -- and there's a helluva big difference between Linus & Cox and the people at SGI, and certainly also between the software they write.
  • Crap name, but I really think this 'brick' implementation is a great idea, and although I don't doubt the backplane/bus adds a certain amount of overhead to the cost, it'd be nice to see this sort of thing on workstation and desktop systems. And yes, I know similar things have been tried before (Acorn?)

    PC getting slow or out of date? Add a new processor brick, that gets detected and used with just a reboot. Keep the old brick if you want. Graphics too slow? Just bought a second 19" monitor? Add a new graphics brick.

    I'm not suggesting this is a cheap or easy solution (yet), but it's a much nicer one than PCI slots, and a tidier one than USB...

    Pax,

    White Rabbit +++ Divide by Cucumber Error ++

  • It isn't just the graphics, it's the architecture of the entire box. You cannot compete with the pipelines on an SGI box with an x86. It's just pointless.

    Ever wonder why Pixar has so many SGIs? It isn't because they have the extra money to burn. It's because SGI _IS_ the best at graphics. Until you use one for visualization (my department does a LOT of vis work - combat simulation), you have no idea of the power of these things.
  • by Cy Guy ( 56083 ) on Tuesday July 25, 2000 @05:25AM (#907634) Homepage Journal
    These machines are used by anyone who has a whole lot of budget and some serious number crunching to do

    Big entertainment companies seem to be at the front of the line for the new systems. Here are some SGI press releases:

    Sony Computer Entertainment Inc. (SCEI) Selects SGI Origin 3000 Series As Broadband Server for Next Generation Entertainment Demonstration [yahoo.com]

    SGI Is Preferred Provider of Content Creation Workstations and Servers For Industrial Light & Magic (ILM) [yahoo.com]

    Pixar Selects Silicon Graphics Octane2 High-Performance Visual Workstations As Production Platform [yahoo.com]

    So you can expect Star Wars Episode 2, and Toy Story 3, to harness the power of these babies.

  • "Well, sure, the Frinkiac-7 looks impressive, don't touch it! But I predict that within 100 years computers will be twice as powerful, 10,000 times larger, and so expensive that only the five richest kings of Europe will own them."
    From "Much Apu About Nothing" ( Season 7 )

    See "The Definitive Frink" [internerd.com]
    cheers,
    j.

  • by LightLiner ( 1785 ) on Tuesday July 25, 2000 @05:27AM (#907639)
    Login to your Beowulf node 1 and try to access memory (r/w) on node 2. Can't do it without message passing or shared memory libraries. Clusters require special programming to go node to node.

    Login to a 3000 and you don't even know what node you're on; in fact, the system doesn't give you any impression that it's any different from a small UP or MP box. The thing that tips you off is the load of 200, 400, or 500+. Depending on what's going on on the system, your process may be migrated from one node to another without you noticing. On the 3000, any process on any processor can access every page on every node -- all through regular memory references.
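
    The cluster side of that contrast, sketched with standard MPI calls (compile with mpicc, run with mpirun -np 2; on the 3000 the receive below would just be a pointer dereference):

        /* Sketch: on a Beowulf, node 1's data reaches node 0 only
         * via an explicit message. Standard MPI; on the Origin the
         * same data would be one load instruction away. */
        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            int rank, value = 0;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            if (rank == 1) {
                value = 42;          /* lives only in node 1's RAM */
                MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
            } else if (rank == 0) {
                /* No load instruction here can reach node 1's
                 * memory; we must receive a copy over the wire. */
                MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("node 0 got %d via message passing\n", value);
            }
            MPI_Finalize();
            return 0;
        }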

  • by ajs ( 35943 ) <{ajs} {at} {ajs.com}> on Tuesday July 25, 2000 @05:31AM (#907641) Homepage Journal
    People are asking things like "why would I use this" and "who wants these?" Let me tell you, in the era of bloatware like Oracle and any of the content management systems out there (possible exception of Zope), the incredible scalability of these systems will be a huge selling point. Oracle, for example, is very careful to build and market their software to be monolithic so that you have to buy big hardware to run it, and then they charge you based on the size of the hardware you're running. Thus, they drive the purchase of huge systems like this, and then charge you up the ass for their "Enterprise class" database.

    Believe it or not, this is actually the kind of business model that the Fortune 500 are not only happy with, but demand.

    Personally, I'd be happy with a database that could run on a loose, fault-tolerant network of a dozen or so small (e.g. 2-processor Intel or Alpha) systems.

    Then again, I'd really like to play with some of SGI's big iron....
  • NOW it's all so clear to me as to why id would want to sell off its SGI PowerHaus(tm).

    The new models are on the way!

    Rami James
    Guy with Duh.
    --
  • by b0z ( 191086 ) on Tuesday July 25, 2000 @04:05AM (#907645) Homepage Journal
    Geeky girls are often impressed by the size and power of your computer equipment. However, size is not the most important thing, it's how you operate it.
  • Anonymous is uninformed. Digital Domain used a Linux render-farm for Titanic, but as usual at DD, the bulk of the 3D interactive work was done on SGIs (and some Macs, and PCs with NT). This is very typical: renderfarms are whatever the company can get for the lowest cost/rendermark (or equivalent), and they don't use any graphics hardware, just the CPUs. For example, Sun gave Pixar a great deal on a renderfarm ... and they still buy OCTANE2s for their interactive work.

    It is trivial to check:
    http://www.d2.com/text/faq/main.html
    and see what tools they use.

    In the last 10 years, consider all of the films that won Academy awards for Computer-Generated special effects, and add in all of those nominated. Of these films, can you name the films that did *not* use SGI?

    Finally, to avoid this turning into SGI versus Linux: do recall that SGI is seriously investing in Linux work and contributing to the community on this turf, so it's not like we dislike it; these are just the facts.
  • SGI closed its doors for the last time today despite announcing record profits.

    "We just ran out of names beginning with O," said spokesman Otto Olson, head of names. "The Ohshit and the Omygod were really scraping the bottom of the barrel."

    Oliver Ottowan added, "We really should have used a more common letter, like T or S."
  • by Andy Dodd ( 701 ) <atd7NO@SPAMcornell.edu> on Tuesday July 25, 2000 @04:53AM (#907655) Homepage
    As someone earlier said, most users probably won't go for the full 512 processors.

    I see the 8-processor boxes being a hot seller in a lot of research labs, or where people just want a centralized server.

    These machines are very similar to an SMP machine from a programmer's perspective. (From a hardware perspective, they're vastly different: each CPU has its own local memory, although the entire system memory is treated as one big block. It just happens that local memory is much faster to access.)

    We have an older 8-processor SGI machine at work that people use to do scientific simulations. Rarely are the simulations themselves parallelized; instead, people log in and the system gives 'em a processor all to themselves if one is free. I think my boss is looking to replace it eventually... Any time someone gets a new system, he wants people to run some benchmarks he wrote. My 500 MHz Coppermine gets twice the performance of a processor on the old machine for small problems, but as soon as the dataset gets larger than the CPU cache, the SGI's excellent memory system kicks in.
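
    The crossover he's describing is easy to reproduce with a generic microbenchmark like this (array sizes and rep counts are illustrative, not his actual benchmark):

        /* Sketch of the working-set effect: stream repeatedly over
         * arrays of growing size; per-element time jumps once the
         * array no longer fits in cache. Generic microbenchmark,
         * not the benchmark from the post. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>

        int main(void)
        {
            for (long n = 1L << 12; n <= 1L << 24; n <<= 2) {
                double *a = malloc(n * sizeof *a);
                for (long i = 0; i < n; i++) a[i] = 1.0;

                clock_t t0 = clock();
                double sum = 0.0;
                for (int rep = 0; rep < 64; rep++) /* reuse the data */
                    for (long i = 0; i < n; i++)
                        sum += a[i];
                double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

                printf("%9ld doubles: %.3f s (sum=%g)\n", n, secs, sum);
                free(a);
            }
            return 0;
        }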
  • 1 terabyte of memory?

    <sarcasm>
    No one will ever need more than 512Kb.

    Now excuse me while I write a program that has all of the bugs^H^H^H^Hfeatures of Internet Explorer, Word, Excel, and (everyone's favorite) Outlook.

    </sarcasm>

    Hmm, the sarcasm doesn't seem to have stopped.

    Devil Ducky
  • NVidia made SGI's "VPro" graphics chipsets...

    You're half right. There are three-and-a-half flavors of Vpro right now. There's V3/VR3, which is an nVidia board with 32 or 64 MB of DDR RAM.

    Then there's V6/V8, also known as Odyssey. These are available only in Octane2. They're an all-SGI design with the Buzz chip-- "OpenGL on a Chip!"-- at the heart.

    There's talk of a V12, which I think is supposed to be a two-Buzz version of V8. That, if it happens, will be exactly twice the geometry performance of V8.

    Odyssey-- V6, V8, V12-- look on paper like they're light-years ahead of the nVidia stuff you find in the 230/330/530 systems. I say "look on paper" because I haven't used one myself. Disclaim, disclaim.

  • Finally-- my very own regeneration alcove :)

    -j
  • Login to your Beowulf node 1 and try to access memory (r/w) on node 2.

    Agreed. However, cluster != Beowulf. Beowulf is just one particular type of cluster (aiming for performance). Other clusters provide high availability. It's possible to build a shared memory cluster, although I don't know of anyone doing so.

    On the 3000, any process on any processor can access every page on every node -- all through regular memory references.

    Yes, but not without cost. Local NUMA memory accesses will be noticeably quicker than remote NUMA memory accesses. Building a shared memory cluster from separate machines will give you the same properties, although the difference between local and remote accesses will be much greater.

  • Well, with a yearly revenue of 2.5 billion, I'm guessing that they certainly sell enough of them...
    Yeah, but how many do they have to sell to get $2.5 billion? Five or ten? :)

    ---------///----------

  • by Anonymous Coward
    No decimals skipped. 3200 -> 32 CPUs, 3400 -> 64 CPUs, 3800 -> (for customers with big wallets) 1024 CPUs. The bandwidth scales with the number of CPUs due to the fundamental architecture.

    Yes, it will be years before Sun, IBM, and HP can come up with a similar machine. Why is this important? Just read the SGI financial statement where they declared their loss. They had orders for $100M even before this machine was announced. I am told that these things are being ordered so fast that manufacturing cannot keep up.

    SGI is back. And they appear poised to kick some serious ass. I just hope that they start advertising so I can show my bosses that they are not dead.
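
    Working the per-CPU numbers from the quoted figures bears this out (the CPU counts are from this post; the division is just arithmetic, not a spec sheet):

        /* Per-CPU share of the quoted system bandwidths, using the
         * CPU counts from the post above. Illustrative arithmetic
         * only; the figures come from the thread. */
        #include <stdio.h>

        int main(void)
        {
            struct { const char *model; double gbs; int cpus; } sys[] = {
                { "3200",  11.2,   32 },
                { "3400",  44.8,   64 },
                { "3800", 716.0, 1024 },
            };
            for (int i = 0; i < 3; i++)
                printf("%s: %7.1f GB/s / %4d CPUs = %.2f GB/s per CPU\n",
                       sys[i].model, sys[i].gbs, sys[i].cpus,
                       sys[i].gbs / sys[i].cpus);
            return 0;
        }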

  • Sounds like someone is playing a numbers game to me. The SMP effect is going to eat their lunch on performance. Say you have an application that runs on ten processors. Now, can you imagine the new performance level if you change that to 100 processors? It won't be even close to a 1000% increase. You'll have contention between the processes, and contention (especially) within the kernel. (Hell, I get it with just 12 processors, depending on the application that is running.)

    If they know something I don't here, I'd love to see it.
  • by rockwall ( 213803 ) on Tuesday July 25, 2000 @04:18AM (#907697)
    As great as it is to see SGI's moves to utilize Linux, computers like these demonstrate that Irix still has a place in the larger picture. Irix is really a pretty neat operating system, and frankly, it can scale in ways that Linux just isn't ready to yet. As long as SGI is still making systems like these on the high end, I don't see Irix being displaced anytime soon.

    Of course, Irix also has a lot of graphics production tools that you don't find on any other OS, Linux included. That's something else that'll keep Irix around, at least until equivalents exist. Ideally, we'd see SGI continue to take steps toward open source/Free software with Irix components.

    Anyway, looks like a pretty cool new system from the people who brought us the original colored computer. Can't wait to get my hands on one of these.

    yours,
    john

"May your future be limited only by your dreams." -- Christa McAuliffe

Working...