Silicon Graphics

World's Fastest Supercomputer to be Linux

xinit was one of the people who pointed us to the CNET story about the possibility that a current bid by SGI for a supercomputer contract could result in a machine running Linux. The supercomputer could be the fastest in the world at the time of its production. SGI has confirmed the bid, saying the machine is being targeted for 2001 if the bid is accepted. It would be installed at Los Alamos National Laboratory.
  • by copito ( 1846 )
    The article is rampant speculation. This does not mean that it couldn't happen, although I wouldn't hold my breath.

    Perhaps we need a new term for a product that even the company which is supposed to be developing it is unwilling to speculate on.

    Any votes for:
    assware (as in the journalist pulled one out of his ass).
    deadlineware (just meeting a deadline)
    slashdotware (targeted at attracting the /. effect)
    pipedreamware...
    you get the picture

    --
  • by substrate ( 2628 ) on Saturday October 30, 1999 @07:32AM (#1575905)
    Unicos runs on the Cray family of supercomputers, such as the T90 and SV1 families of vector-processing supercomputers (which use custom-designed vector processors) and the T3E family of massively parallel supercomputers (which use DEC Alpha processors). SGI is spinning off Cray, or something effectively similar to that. SGI won't be releasing any Unicos-running computers in the future.

    SGI also makes massively parallel computers, which if properly configured (read: lots and lots of processors) are supercomputer-class machines. These machines presently run, hold on to your hats, IRIX, and they use MIPS processors. One of these machines is part of the ASCI contract (Accelerated Strategic Computing Initiative) and is based on an Origin 2000 system [sgi.com].

    Right now SGI is developing CC-NUMA computers (the same multiprocessing technology behind the Origin 2000 computers) using Intel IA-64 processors. Rather than attempting to port IRIX to an Intel processor or pretending that Windows NT will scale, SGI is relying on Linux. Right now Linux can't do it, but SGI is working on improving that aspect of Linux. This is all stuff that's been posted to Slashdot before. Here's a blurb [fi.udc.es] to that effect.
  • by copito ( 1846 ) on Saturday October 30, 1999 @01:40AM (#1575906)
    There are several Linux clusters in the top 500, but they are not individual computers, and certainly not anything that could replace an E10K easily. While Beowulf is great for a certain class of parallelizable problems, a Big Iron database server such as the E10K is valued for its single system image and huge memory access, as well as fault tolerance and failover capabilities in hardware and software. I'm not convinced that Linux will ever be used in such a system, since the advantages of open source are not as great in an environment where there are few end users, and where the end users are already spending enormous amounts of money on hardware, software, and support contracts. So the cost of an OS (really just the price of the OS support contract) is minimal in comparison to other costs. And the ability of the hardware to work tightly with the OS is a major selling point.

    Linux will continue to thrive in the low end and will migrate up to more and more powerful servers as they get cheaper and more generally used. High Availability solutions are already beginning to surface, as with TurboLinux. This will probably be the way to go for most modest-sized enterprise applications.

    The only way Linux will get onto a Big Iron box is for SGI, IBM or Sun to put it on there. The only good reason to do this would be to ease migration from low end solutions running on Linux. Or to appease the PHB's that demand a Linux based solution (just wait...It'll happen). Since they wouldn't abandon their current customers, they would be supporting two OSes in the same space with the same developers for quite some time. While the Linux solution would be open source, there would not be great advantages to this since the user community would be so small.
    --
  • They could use VMWare :-)

    But seriously, as I have mentioned in other posts, this is unconfirmed speculation about an unbuilt machine. FUD might be bad, but vaporware is worse. This appears to be slashdotware (a product invented by the journalist to attract hits).
    --
  • When Steven Chen left Cray to start (SuperComputer Systems?) they took over the old PC board facilities in Eau Claire.

    This was back in the days when the model for a fast machine was a big processor. And lo and behold, with the funding they had (read: NSA) they were able to produce a big Al-clad box that DID run.

    The OS they used for that project?

    Linux.

    What happened? Well, processors got faster and cheaper. So the need for a high-dollar mondo machine has fallen off. And today, the whole supercomputer industry is hurting.

    So, a Cray/Chen/SGI/Linux connection. And I'm betting the 'linux champions' from Chen's venture are now back working for Cray/SGI. (and Intel picked up the other stragglers)

    Something else to think about:
    The OS is nothing more than a way to make the hardware useful. And if the company can use the work of others, it lowers their development costs. Thus, it is now a race to the bottom (cost-wise), with open-sourced BSD and GNU/Linux having the lowest development costs and licensing fees.

    SGI is fighting to exist, and OpenSource will help them do just that.
  • I've read about computations which determine the mass of a proton within the context of a particular physics theory.

    The result would be a single floating point number! Or maybe I'm just simplifying things...
  • by RISCy Business ( 27981 ) on Saturday October 30, 1999 @08:20AM (#1575913) Homepage
    You know, I have to wonder, when will you people realize that it's a pipedream?

    If this system is to be ready by 2001, and is to be faster than ASCI Red (9,282 Intel Pentium Pro 200MHz with 1024k cache each, using proprietary interconnects (IP over SCSI, IIRC)) then it's not going to happen.

    Even if the gov't is willing to sacrifice the reliability of the system, Linux is not ready and will not be for many years. Period. You're talking about going from 2 processors, which only works on x86 and Alpha currently, to over a thousand.

    Sorry, folks, it isn't going to happen. It's a matter of rapid development and availability. Sure, SGI could probably do it, but not by the due date. Nor would it perform as necessary, due to the requirement of extensive assembly-level optimization in both compilers and kernels.

    ASCI Blue Pacific, IBM's entry, is one of the most powerful computers in the world. And what I'm about to say will probably send most of you into denial and/or shock.

    Blue Pacific isn't customized all that much. In fact, barely customized. It's nearly the same machine you can order for your business today. Perhaps even slightly slower.

    That's right - ASCI Blue Pacific CTR SP Silver and ASCI Blue Pacific, #11 and #2 respectively on the Top 500 list, are retail systems with some additional software. IBM's SP Silver in Poughkeepsie is a retail SP.

    Shocking, isn't it? That someone can build a supercomputer that any business can buy. IBM holds a *lot* of the first 50. You don't see Sun till 54 with a machine that was totally custom built. Hell, look at #20! IBM SP Power3 200MHz. *200MHz!* And it smokes 480 supercomputers! That should tell you something right there. Now, SGI's talking about, more than likely, an x86 supercomputer or IA64 supercomputer, that's supposed to run Linux, have more than a thousand processors, and outperform ASCI Red? Nope, 'fraid not, folks. Maybe in 5 or 6 years, but not one and a half. Sorry. Deal with it.

    -RISCy Business | Rabid unix guy, networking guru
  • by Tony-A ( 29931 )
    >>No, if they go with Linux it'll be distributed to the US government(/military). The US government and SGI are separate entities, therefore that counts as distribution. The GPL applies, and SGI has to release all its source code patches on request.
    True, but only to the US government(/military). SGI is under NO obligation to release source or binary to ANYONE else.
  • It's really sad when Slashdot gets to the point where something like this is not moderated up, and the author has to hide behind an AC. Yes, it is a fact that NT scales better than Linux. Why you would want NT vs. Solaris or something more stable, I don't know, but I suppose uptime is not as important for a supercomputer as it is for a server. When it gets down to it, NT is just better for some things. Although I doubt either Linux or NT will scale well enough for a supercomputer. Now if they could port BeOS to an 8-proc IA64 cluster, that would be something. Ohhhhhhh.
  • I remember being told by my father-in-law (who is an IBMer) that the reason that IBM didn't make a bid for this is that they didn't feel that they could make enough money on the project to justify it. I was just doing some Big Brother....uhhh....Blue bashing.

    And actually the saying _does_ go "No one ever got fired for buying IBM."
  • by Anonymous Coward
    A few things have been posted which are either a little off-base or just not terribly informative. I thought I'd try to clear some of this up, and (more importantly) direct you to some places where you can find out more. Take this however you like: I do work for LANL, and use some of the systems described here on a fairly regular basis, but that doesn't necessarily mean this information is perfect. Ahem.

    Blue Mountain has two parts, an open and a secure side. As far as I can tell (and this is not terribly surprising), only details on the open side are available from press releases, etc. Anyway, it's a big beefy SGI Origin2000 system, with lots and lots of boxes each holding lots and lots of processors. (Sorry about the vagueness here -- you can probably find details if you look hard enough.) We're talking thousands of processors here, in case that wasn't abundantly clear.

    My slightly-biased opinion would be that, in light of the many millions of dollars which were undoubtedly spent on said machines, it is extremely unlikely that the cluster would be ditched anytime in the near future, even if we end up getting a faster cluster -- you can always use more computing power. :-)

    For lots more info, check out this press release [lanl.gov], which gives some (now outdated) details on nirvana, the open part of blue mountain. Also, the ACL site at Los Alamos [lanl.gov] is pretty good, though a bit PR-y. It also has details about the (currently extant) Linux cluster, in case you're interested. Finally, if you're curious about the real details of the Blue Mountain operating environment, you can take a look at this page [lanl.gov], which has lots of good info.

    Have fun.

  • I wonder if Kasparov will try to beat it at chess.
    ---
  • My supervisor was at that big parallel computing conference 2 weeks ago, and he told me that the number of nodes at Sandia amounted to 9000.

    Is that correct, I wonder? Last time I checked they were talking about 4,500 nodes... hmm, are those dual? ;)

    Whatever, I'm sure Sandia will get some upgrade by 2001. But I like the idea of the fastest computer being a Beowulf cluster.

  • Slashdot is lots of machines, DejaNews is lots of machines, Etoys may or may not be lots of machines.

    Frankly, "enterprise ready" is a meaningless buzzword, but I, at least, tend to think of large single system image Big Iron database servers like the E10K. In that respect neither Linux nor Microsoft is enterprise ready.


    --
  • They've confirmed it as a possibility (see original comment). SGI is actively working on improving Linux's SMP support for IA64. A timetable of two years is described as "aggressive" in the article, but they say that the work could be completed in stages - it may not actually be necessary to have the whole machine up and running from day one of the rollout, which makes it more feasible.

    It's worth reading the article before posting!

  • SGI is willing to speculate on it. It's worth reading the original comment and the story before posting!

  • Linux is good, but it deserves to have its triumphs accurately reported. The supercomputer in question is a bid which does not necessarily specify Linux and will not necessarily be built. Contrary to what the Slashdot summary said, SGI did not even confirm that Linux is a possibility. All they confirmed was that they made a bid. The rest is speculation and wishful thinking.
    --
  • ...or maybe none at all.

    I doubt that. Maybe in the days when all the results came straight out on the line printer... :-)

  • No, they didn't even confirm the possibility of using Linux, contrary to what the Slashdot summary stated. I suspect that they will use a tried and true method that demonstrates their strengths in high-end Cray and Irix technology. After all, the #1 supercomputer is not built for profit, but as an advertising tool.
    --
  • They didn't make a bid for this contract. They already have a $94m supercomputer contract ongoing.

    Worth reading the original story before posting!

    Hypothetically, if they were in the running, they might well say "Okay, our solution could run Linux if you want. Remember, no-one ever got fired for buying IBM!" Not those exact words, obviously.

  • Yes, but ask Dell or IBM a few years ago if they would support Linux, and they'd reply "No. We don't do that." or "Linux? Who?"

    They wouldn't say "That's possible" - a few years ago, Linux was regarded as a fringe OS. If nothing else it's a testament to how far Linux has come (but we knew that already! :-)

  • There are only three processor types that SGI would consider: MIPS, ia32, ia64

    On MIPS, the OS would be Irix.

    On ia32 it could be either Irix or Linux.

    On ia64 it would be Linux.

    Since SGI is planning a big move towards ia64/Linux... it's not all that unlikely that they're going to want to use that for the supercomputer.

  • All I can find in the actual article (not the /. summary) is that they confirm that they are making a bid. I can't find anything from SGI which confirms the possibility of Linux. I would be quite surprised by this if it were the case.
    --
  • No, if they go with Linux it'll be distributed to the US government(/military). The US government and SGI are separate entities, therefore that counts as distribution. The GPL applies, and SGI has to release all its source code patches on request.

  • It's run for the Department of Energy.
  • by Kragen Sitaker ( 1440 ) <kragen@pobox.com> on Saturday October 30, 1999 @09:09AM (#1575938) Homepage
    Just some responses to some things people have said in other comments:

    SGI has more than one line of supercomputers. ASCI Blue Mountain is an SGI Origin2000 machine. I don't think we can expect to see Linux on Crays in the near future. (And didn't SGI just divest Cray again?)

    It's true that Linux hasn't currently "mastered" 16-CPU SMPs. If Larry McVoy is to be believed, that's probably a good thing for the correctness and stability of the kernel.

    CPlant is number 129 on the TOP500 list; it's the fastest Linux machine currently listed. It used to be below 100, but more new machines were added.

    The 1000-node genetic-programming cluster mentioned recently on Slashdot, and distributed.net, are not on the list at all; to get on the TOP500 list, you need to run LINPACK fast. This (a) does not interest some people, and (b) is not well-suited to the structure of some clusters. A parallel machine that is very fast for some tasks may be very poor at others.
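
    (If you're curious what "run LINPACK fast" actually measures, here's a minimal, unofficial sketch in C. The matrix size and the naive elimination are my own arbitrary choices, not the benchmark code: time a dense solve of Ax=b, then divide the standard LINPACK operation count of 2/3*n^3 + 2*n^2 flops by the elapsed time.)

    /* Unofficial LINPACK-style rating sketch (assumed parameters; the
     * real benchmark is far more careful).  Solve dense Ax=b by
     * Gaussian elimination, rate it with 2/3*n^3 + 2*n^2 flops. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 1000

    static double a[N][N], b[N];

    int main(void)
    {
        int i, j, k;
        clock_t t0, t1;
        double secs, flops, m;

        for (i = 0; i < N; i++) {            /* arbitrary test data   */
            for (j = 0; j < N; j++)
                a[i][j] = (double)rand() / RAND_MAX;
            a[i][i] += N;                    /* dominant: no pivoting */
            b[i] = (double)rand() / RAND_MAX;
        }

        t0 = clock();
        for (k = 0; k < N - 1; k++)          /* forward elimination   */
            for (i = k + 1; i < N; i++) {
                m = a[i][k] / a[k][k];
                for (j = k; j < N; j++)
                    a[i][j] -= m * a[k][j];
                b[i] -= m * b[k];
            }
        for (i = N - 1; i >= 0; i--) {       /* back substitution     */
            for (j = i + 1; j < N; j++)
                b[i] -= a[i][j] * b[j];
            b[i] /= a[i][i];
        }
        t1 = clock();

        secs  = (double)(t1 - t0) / CLOCKS_PER_SEC;
        flops = 2.0 / 3.0 * N * N * N + 2.0 * N * N;
        printf("n=%d: %.2f s, %.1f MFLOPS\n", N, secs, flops / secs / 1e6);
        return 0;
    }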

    With regard to DES cracking: the EFF's DES cracker, which cost less than a quarter of a million dollars to build, cracks DES keys in a matter of days. Such a machine can scale linearly. The fact that distributed.net takes a month to crack a single DES key does not demonstrate that the NSA requires months to do the same.
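
    (To put numbers on "scale linearly", a back-of-the-envelope sketch. The ~9e10 keys/second figure is an assumed ballpark for an EFF-class machine, not a measured spec.)

    /* Brute-force keysearch scales linearly: double the hardware,
     * halve the expected time.  Per-machine rate is an assumed
     * ballpark, not a measured figure. */
    #include <stdio.h>

    int main(void)
    {
        double keyspace = 72057594037927936.0;  /* 2^56 DES keys          */
        double rate = 9e10;                     /* assumed keys/s/machine */
        int m;

        for (m = 1; m <= 8; m *= 2)
            printf("%d machine(s): ~%.1f days on average\n",
                   m, keyspace / 2.0 / (rate * m) / 86400.0);
        return 0;
    }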

    Generally, "secure" DoD sites are not connected to the Internet, auditing or no.

    "supercomputer" and "enterprise server" are very different categories. "enterprise server" means "mainframe killer" -- that is, reasonable CPU speed, fast I/O, but above all, reliability. Linux is definitely fit for supercomputing, and is being used for supercomputing all over the world. Linux is probably not quite yet fit for being an "enterprise server".

    However, many supercomputers do indeed need lots of disk storage.

    With regard to http://www.gapcon.org/listg.html [gapcon.org]: someone said, "You will notice there are no Linux installations in that list." Actually, they list a bunch of machines from Atipa Linux Solutions at LANL, the Avalon Beowulf at LANL, the Parnass2 Beowulf at the University of Bonn, the LoBoS Beowulf at the National Institutes of Health, the Centurion Beowulf at the University of Virginia, and possibly some others. They're a minority, but they're way cheap, and they're growing fast.

    With regard to the GPL: if I hack something proprietary into Linux, I need to give source, licensed under the GPL, only to people whom I give binaries to. I am under no obligation to give source to anyone else. However, the person to whom I give it can put it up on their FTP site if they want.

    Kragen Sitaker, current Beowulf FAQ maintainer

  • Actually no, nobody is saying that the NSA is using a general-purpose supercomputer to crack DES. Specialized hardware is clearly the way to go. Witness the DES cracker [eff.org] built by the EFF for US$250,000. This is a purely brute-force attack. Even so, along with Distributed.net, they broke DES in 22 hours. The NSA could probably use more efficient techniques of cryptanalysis and more expensive hardware to be faster.

    As far as I know, no one in the private sector has built such a beast for cracking RC5, but it could certainly be done.

    The NSA would typically use a supercomputer only for the last step in certain factoring algorithms for breaking RSA, or other such problems that require a single system image and huge amounts of memory. This was the technique used by a group of researchers that cracked [slashdot.org] a 512-bit RSA key. The first phase was distributed, but the last step required a single supercomputer.



    --
  • I apologize. I didn't see the "T30" when I glossed over the C-net post. This is a bit confusing. There is no "T30" product. It might mean a "T3E", but those use DEC Alpha processors and run a UNICOS variant. It's possible but highly unlikely that Cray is building a follow-on to the T3E based on IA-64 and Linux. I consider it highly unlikely because they'd have to have started the hardware development cycle long ago to make 2001. The Cray division was supplying me with a paycheck and I didn't hear any rumors about Intel-based T3E follow-on products. Cray is developing interesting things, but I don't think there's anything T3E-like in the pipeline. Cray won't be moving to Linux.

    There's not enough information to try and figure out what the reporter heard, i.e. we don't know what was fact and what is conjecture. SGI has said that Linux is going to play a large role in the future. SGI is going to make Linux scale beyond simple bus-based 2- or 4-processor systems. Linux will be running on CC-NUMA machines.

    Conjecture 1: A T3 class machine is being bid. The reporter remembered that SGI made a commitment to Linux and mixed and matched.

    Conjecture 2: A CC-NUMA machine is being bid, "T30" doesn't mean anything. The machine will be running IA-64. Linux will be the operating system.

    Since I don't even have a press release in my work mailbox I can't guess as to the real scenario. I'll ask for some information next week.
  • by substrate ( 2628 )
    Sorry, following up to myself.

    Conjecture 3: Hybrid system based on vector processors such as the SV1 and CC-NUMA like the O2000.
  • by copito ( 1846 )
    Where does it say SGI confirms they will be using Linux in the story? I see it in the Slashdot summary, but unless Hemos got on the phone to SGI, I think it's likely just a misunderstanding.
    --
  • I remember it was originally 7,000-7,500, and then the steps for upgrading to 9,000 (or so) processors were going to take place. Of course, that was the idea when it launched, so I'm sure it's been at 9,000 for a long time.

    ASCI Red is at Livermore Labs, right? Too bad, I'm told, it hasn't been too useful for NIF...
  • by copito ( 1846 )
    I was agreeing with you until:
    BTW, did you hear that IBM is standardizing all their desktops on Windows2000? If Linux is soooo much better why is that?

    The real answer is that what is best for me is what is best for me. That could be Linux, Windows, or Xenix. Frankly the actions of IBM or SGI are not extremely important in this choice unless they demonstrate that my particular application is going to run better (according to my criteria) than another.


    --

  • Correction - Microsoft has announced that Windows 2000 "DataCenter" will support 32 CPUs, and journalists are speculating that this product will ship Q2 or Q3 2000. (Of course, they were speculating that Win2000 would ship last month.)


    --
  • by Rendus ( 2430 ) <rendus@gm[ ].com ['ail' in gap]> on Friday October 29, 1999 @11:35PM (#1575948)
    The second fastest current supercomputer (ASCI Blue) resides at LANL as well. Will they be running these concurrently, or are they scrapping their current cluster to put this one in?

    The current fastest, ASCI Red, is located in Sandia National Labs. Both these systems were built by Intel I believe, and are gigantic clusters running some custom software.

    The biggest Linux box/cluster/whatever is Avalon I believe, currently ranked #160, and also resides in LANL. Wasn't there one in the 50-60 range as well?

    Personally I can't see this being anything but a good thing for Linux, both in terms of another selling point (Hey, it's good enough to be on the world's fastest computer!) as well as (hopefully) advancements in scalability (I can't imagine SGI implementing a massive cluster of single-CPU boxen, meaning they may take a long hard look at SMP code and optimize it for whatever platform they're considering rolling out for this).

    And, it has to be said:

    Imagine a Beowulf of these things! Heh...
  • by Anonymous Coward
    Betting Quake is the second piece of software ported. The first being a web browser for high-speed pr0n =P What else do people do with fast computers!
  • At LANL, much better things than playing Quake or surfing the web. In fact, seeing as how it's a DoD secure site, I seriously doubt they even *have* access to the web for non-research purposes; I'd be surprised if they didn't audit and account for every single packet which passed through their routers. They do all sorts of neat things there, though, such as simulations of any physical phenomena you can think of, to begin with. (Being from New Mexico, I was able to participate in the Supercomputing Challenge that they run every year back when I was in high school. Very useful experience.)

    Also, I fail to see how incredible CPU power would be used to enhance pr0n-downloading speeds. That's generally a bandwidth issue, not a CPU issue.
    ---
    "'Is not a quine' is not a quine" is a quine.

  • If they're planning on running Linux on a supercomputer, it better have "enterprise" grade features that they've been promising us -- tested, signed, sealed and delivered.

    Journaling file systems, multithreaded networking, ... categories that Linux just generally needs help with. In its current state, as much as I love Linux on the desktop and commodity-grade-hardware server, I can't even imagine it on a supercomputer -- let alone the world's fastest.

    -Chris
  • by Nate Fox ( 1271 ) on Friday October 29, 1999 @11:47PM (#1575953)
    to be installed at Los Alamos National Laboratory (LANL) in New Mexico.

    No, I'm sorry folks. That line is to be read:
    to be installed in Nate Fox's garage (NFg) in suburban Los Angeles.

    You'd think a major publication like C|Net would get their facts straight ;)
    ...Nothing is so smiple it cant get screwed up.

    -----
    If Bill Gates had a nickel for every time Windows crashed...

  • How do you build a reliable system from thousands of CPU boards? There will be plenty of hard and soft faults in a system with that many components.
  • I'm confused. Slashdot has given me mod points over the weekend when I can only use my non-cookie browser and Lynx. I can't even *see* this story on the former, and hitting 'Off-topic' on the latter ended up giving this first post a +1. Any idea wtf is going on? Anyone?

    Anyway, I apologise for the dodgy moderating. If this post even manages to get through. Hmm.
    --
    This comment was brought to you by And Clover.

  • I bet so......

    SMP is very much on the forefront of kernel development.

    I honestly expect Linux to outperform NUMA within the year 2001.
  • I am smiling ear to ear.

    Linux running the "World's Fastest Supercomputer".

    It seems to me that if it is good enough for the scientists at Los Alamos, then ANYTHING that Microsoft says about Linux will seem like FUD.

    Even if some of what they say is true.

    Cray used to be well established as the supercomputer company. I remember no MS FUD about Cray. I wonder what they will say about this.


  • It would be great to have SGI addressing those issues, bringing Linux closer to "enterprise" level. Of course, I can't see it meaning too much to those of us running Linux on single or dual Pentiums, as SGI's modifications to the kernel would be aimed at the freak of a machine they'll be running it on. But then again, we can always hope for some of that work to be applied to x86 systems.
    --
  • No, they have complete faith they can manage to:

    1) Make SMP good enough for the platform they're making the cluster out of, or

    2) Use 2- or maybe 4-CPU boxen (unlikely).

    Remember, SGI owns Cray. SGI by itself is a powerhouse when it comes to multiprocessing, and they have the Cray guys, who have got to be better (Unicos has to be able to support upwards of 1024 processors; see the T3Es on top500.org - the 3rd fastest computer in the world has over 1k CPUs).

    I hope they decide to release the source to their patches (I don't think they have to, though, since they're modifying for internal use... But maybe they do, since they're then selling it. It can't really hurt them to release the source, so I don't doubt they will).
  • In fact, seeing as how it's a DoD secure site, I seriously doubt they even *have* access to the web for non-research purposes

    Well, after the whole Chinese spying fiasco, I imagine it IS much harder to copy files from their machines onto your laptop or vice versa. However, I know people who work there and they do in fact have access to the net, just not with "secure" machines, which I'm sure the big machines are.
    The only catch is, all the pr0n you download also gets reviewed by the IT security people. D'oh!
    --

  • I apologise for being completely obvious, but try malda@slashdot.org

  • by Wakko Warner ( 324 ) on Friday October 29, 1999 @11:58PM (#1575964) Homepage Journal
    SGI's Cray line of supercomputers -- which I'd imagine this would be -- runs a specialized UNIX OS which is highly tuned for multiple processors (Unicos). There's absolutely no way on this planet they'd pull that out and toss Linux in, especially given that Linux can barely handle eight processors as it is, let alone *hundreds*.

    Don't get me wrong, I love Linux. But let's be realistic here. That article contained little to no factual data. The only thing they're running on is conjecture -- SGI has expressed interest in Linux in the past, so they're assuming this hundred-million-dollar multi-teraflop machine will run it? I'll believe it when I see it.

    I'll bet you, unless SGI comes up with some sort of Beowulf solution instead of their time-tested Cray supercomputers, we'll be seeing yet another Unicos machine at the top of the "World's Fastest Supercomputers" list in a few years.

    - A.P.
    --


    "One World, one Web, one Program" - Microsoft promotional ad

  • You just HADDA mention the big "B-word", didn't you. Heh heh. Personally, I find it kinda funny that the fastest computer in the world resides in SNL. Just keep it away from Hans and Franz, the Samurai, and Rob Schneider (RED! RED-A-RINO! REDINATOR! RED!)


    Chas - The one, the only.
    THANK GOD!!!
  • ASCI Blue is not Intel, it's SGI: a cluster of Origin 2000 systems running Irix. Check out http://www.top500.org for the list of the current 500 fastest supercomputers and their details.
  • by teraflop user ( 58792 ) on Saturday October 30, 1999 @03:48AM (#1575967)
    Wakko is absolutely right when it comes to the relative merits of Linux and Unicos on SMP or parallel machines - Linux wasn't designed for that job and won't do it. Not now, not in the near future. Unicos currently runs machines with up to 2048 processors. Linux hasn't really mastered 16.

    However, the proposed machine is bound to be a cluster, not a single machine. Linux will do this just fine. In fact Linux would run on ASCI Red if there were drivers for the networking hardware. (They've booted individual nodes with both Linux and NT).

    So there is a possibility of Linux. I guess it might be easier to use Linux on IA64 than port IRIX/Unicos.

    My reservation here is that neither the Itanic nor MIPS chips offer cutting-edge performance.
  • FYI, Cray machines are made up of custom Alpha cores, and SGI has been researching Linux for a while.

    But my take is that it would consist of a large cluster of big (8-32 CPU) IA64 SMP nodes (this is the direction SGI is heading, I mean expanding IA64 SMP environments).

    BTW, there are some non-public (not announced in the top500) installations of Linux clusters which are well above Avalon's #160 ranking.
    _____________________________________________________
  • Fine-grained SCs are not the only way to compute nuclear (or I guess now it is anti-matter?) reactions.

    Think super-scalable, super-reliable clusters.
    _____________________________________________________
  • by Effugas ( 2378 ) on Saturday October 30, 1999 @05:22AM (#1575971) Homepage
    First of all, there's quite a bit of difference between fastest and fastest known. I can't imagine both the Chinese AND the American governments not having some exceedingly classified hardware that blows the pants off the open stuff (read: governmental phallus-phlashing).
    Second, the meaning of fastest is very unclear. I'd go so far as to say that any system that implements a given function in software instead of hardware is going to be orders of magnitude slower than the state of the art. Witness the EFF DES-cracking machine, 3D graphics accelerators, even math coprocessors. Fitting a square peg into a round hole is actually a pretty common occurrence in the computer world, but it runs at a relatively tortoise-like rate compared to what can be pulled off with raw gates.
    That's why XISC--Extensible Instruction Set Computing--is probably the upcoming processor paradigm. Programmers need the ability to redefine round holes into square ones, so the square pegs fit right in.

    Yours Truly,
    Dan Kaminsky
    DoxPara Research
    http://www.doxpara.com
  • What if you counted the combined processing power of the Seti@home clients. Would that even crack the top 100?
  • by copito ( 1846 )
    Actually I kind of drifted into talking about Big Iron rather than supercomputers. Clearly they have different requirements, and one is willing to do much more tweaking and spend more money on a cutting-edge machine. For Big Iron, one is worried about such boring things as support, uptime, reliability, cost, etc.
    --
  • This is an area where I have a pretty good amount of knowledge, and I think a few clarifications need to be made.

    First off, this has absolutely nothing to do with the Cray division (which several people, including Hemos, seem to think). This project, and the current ASCI Blue Mountain project, are both built from SGI's Origin line of servers (the SN1 is the next generation of this). Cray's unit only works on the vector supercomputers. Also remember (from August) that SGI is going to be getting rid of this division.

    Second, I'd like to point out that this article is really just speculation. Read it closely, they take a couple of facts - SGI is trying to get this contract, they are working on ramping up the scalability of Linux, and the new SN1 servers will be eventually based on the Itanium - and they try to draw a conclusion that Linux might be what is run on this supercomputer.

    Now, I'm not saying that this isn't a valid argument - but Linux as an operating system has a LONG way to go before it supports the massive number of processors and amount of memory we're talking about here (Blue Mountain has 6144 processors). There is still a lot that Linux is missing.

    This does not necessarily mean that SGI can't get it that far, especially with its experience in scalable OSes. I would love to see them do it. But when you already have an architecture and OS that works running on the Blue Mountain configuration, it would be going quite a bit out of their way. So, until I hear SGI themselves say "we're running Linux on T30", I'm going to be skeptical.

    If they DID - hey, that'd be a GREAT push for Linux. Let's hope they go for it.

  • Yes, there will be hard and soft faults in a large experimental system. That is understood by you and me and SGI and the customer. You don't push the limits with a tried and true platform. You push the envelope when you first try a technology.

    SGI will likely be able to bring redundant processors and subsystems online (hot-swappable) as needed. Software becomes stable over time on these types of systems. The key is that SGI support will have access to all source whenever they need it.
  • I'm one of the SysAdmins for the Centurion cluster at UVa, as well as a student in Andrew Grimshaw's (the professor who built Centurion) Operating Systems class. First of all, I'll play the part of Greg Lindahl briefly and say that Centurion is technically not a Beowulf cluster. AFAIK, part of what defines a Beowulf-class machine is one or more head nodes -- usually one -- which dispatch jobs to multiple client nodes. This is somewhat like Asymmetric Multiprocessing, in which there is a master processor which runs the OS and dispatches jobs to the slave processors. The head node(s) has more processing power, memory, etc., in order to be able to manage the other nodes. The nodes in the cluster itself are usually of homogeneous composition, running some freenix, usually Linux.

    Centurion itself consists of 128 DEC 21164 Alphas and 128 dual PII-400's, all running Linux. There are a few of us just itching to try and run LINPACK on it. :) There are also several assorted machines which serve as frontends, running anything from FreeBSD to Solaris to (ick) IRIX. There is no head node which dispatches jobs, and each node is independent of the others (no sub-clusters within the cluster). I know I'm currently stepping on a lot of toes and re-hashing a lot of info, so visit the Beowulf FAQ. Kragen's done a great job of gathering info, and it's a good read.

    Secondly, Professor Grimshaw discussed the PetaFLOP project the other day, in which the LANL project is a stepping stone. If you shell out enough money, you can have a GigaFLOP machine on your desk. If you shell out even more money, you can have a TeraFLOP machine in the raised-floor room with tons of A/C at your research center. The challenge now is to bump it up another 3 orders of magnitude. By combining SMP nodes with a message passing interface or some other form of managing distributed memory, LANL hopes to build this 30 TF machine.
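
    (For the curious, the head-node/worker division of labor mentioned above looks roughly like this in MPI. A bare-bones illustrative sketch, not anybody's production code - the "work" and the message tag are stand-ins. Compile with mpicc and launch with something like mpirun -np 16.)

    /* Head node (rank 0) collects partial results from workers.
     * Illustrative only: part = rank*rank stands in for real work. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, i;
        double part, total = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {                      /* head node: collect     */
            for (i = 1; i < size; i++) {
                MPI_Recv(&part, 1, MPI_DOUBLE, i, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                total += part;
            }
            printf("sum over %d workers = %g\n", size - 1, total);
        } else {                              /* workers: compute, send */
            part = (double)rank * rank;       /* stand-in for real work */
            MPI_Send(&part, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }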

    However, SGI may not get the bid as C|NET reports. When the gov't spec'ed out the machines they want to have as nodes, they requested 16-processor nodes. So let's look at the Big Boys of horsepower:

    • Sun: Has built large systems before, but where they have the strongest performance is 64-way SMP, not 16. This is impressive, but scaling down can be as difficult as scaling up at times.
    • DEC: Alphas rule. Period. The 21264 is an awesome machine, and its performance isn't just limited to great benchmarks. A dual-666^H7 is probably the most impressive piece of machinery you can get for your desktop (if you can spare the $10K+). However, DEC doesn't have a lot of experience building nodes any larger than 8. There are some 14-way machines available, but it still doesn't meet the spec.
    • SGI: Another great company, pros and cons discussed in the article. But will they be around 2 years from now when this machine needs to be built? SGI's taken a serious downhill slide recently, and although they have experience building great 16-processor machines, LANL may not want to risk funding a company in such a state.
      Which leaves ...
    • IBM: Great hardware, is really on the upswing in the last few years, and has experience with 16-way boxes. A definite candidate for the contract.

    Just my 2 drachmas ...
    -OWJones
  • Oops, sorry, my bad. I knew that SNL was DoE, but thought LANL was DoD. Regardless, though, they still have very tight information auditing, especially since most of the systems there would most likely be Q-level or higher, right? I mean, just to get onto the facility you need an L clearance (which is, of course, par for the course).
    ---
    "'Is not a quine' is not a quine" is a quine.
  • Of course this is a pipe dream, the story is complete rumor-fodder. It's like posting news about a possible movie that should be out in 2001. I mean, we all know the world is going to end in a couple of months anyhow, right? :)

    ASCI Red - about 10,000 PPro 200's. Okay. #1.

    Linux works on more than 'two' processors on x86 and Alpha, and I'm sure a lot of redesigning (that you're not going to hear about until IA/64 comes out) is going on right now under NDA. I hope SGI is currently working with them on that, but either way that should be resolved. Of course, we can only hope that by 2001, IA/64 is the greatest invention since sliced bread, otherwise we'll be looking to x86, Alpha, PPC, and anyone else who thinks they have a "good idea for a chip". :)

    However, it's the nature of clustering that I don't think they'll have thousands of processors all in the same box. And Linux is a proven clustering solution, as evinced by its entries in the TOP500. It's just a little newer and a lot cheaper than IBM.

    Also, Linux clustering solutions generally use Commodity Off-The-Shelf Hardware. That's the point: it's cheaper that way. So don't brag about how IBM uses a standard, proven design. So does everybody else. (I admit, though, IBM is somewhat high up in the rankings, just like SGI. :)

    #2 on the list, ASCI Blue *Mountain* looks like it's held by SGI.

    Ooo, #20 has 200 MHz processors. So does #1! The MHz don't matter! The fact that #20 has 768 processors total might be a *little* bit more important.

    And this is another reason why this makes sense. Anything that outperforms ASCI Red would need more than 1000 processors, since it's 10,000 PPro 200's. Just on processor power and MHz alone, you'd want, say, 2,000 1GHz processors. However, if IA/64 offers the sort of speedup expected of a completely new architecture, including weird optimizing compilers, and processor speeds continue to increase as usual, maybe they can do it with 1,000 1.4GHz processors or something.
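
    (A deliberately crude raw-clock check of those numbers - all of them the hypotheticals from the paragraph above. It ignores work per cycle, memory, and interconnect, which are the real story.)

    /* Aggregate clock only; MHz is not FLOPS.  Shows why ~2,000 1GHz
     * chips match ASCI Red's raw clock, while 1,000 1.4GHz chips need
     * the architecture to make up the rest. */
    #include <stdio.h>

    int main(void)
    {
        printf("ASCI Red: 10,000 x 200 MHz = %6.0f GHz aggregate\n",
               10000 * 200.0 / 1000.0);
        printf("Option A:  2,000 x 1.0 GHz = %6.0f GHz aggregate\n",
               2000 * 1.0);
        printf("Option B:  1,000 x 1.4 GHz = %6.0f GHz aggregate\n",
               1000 * 1.4);
        return 0;
    }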

    Also, from the article, it didn't sound like they needed to beat ASCI Red right off, just eventually. They admitted that the technology needs time to mature. And that isn't just Linux. It's IA/64 development, and SGI's x86 porting efforts too.

    I'm willing to wait two years to see what happens. Everything you claim happening in two years would be somewhat unlikely. However, I don't think the article required that, and I wouldn't blame it all on Linux. Besides, the OS on the nodes will probably be pretty customized and stripped-down anyhow. They're just supposed to be computational workhorses.
    ---
    pb Reply rather than vaguely moderate me.
  • by treke ( 62626 )
    He probably meant that ASCI Red was made by Intel.
    treke
  • Security is obviously very important, but not all of the whiz-bang systems are used for classified computing, so no, you don't necessarily need a clearance to use some of them. And many areas don't require a clearance for entry. The Lab employs many foreign nationals, after all.
  • What are the IBM boys going to say? No big deal for them, another gigantic, overpriced SP.
  • Heh heh. I think this'll put the scalability and HA gripes to rest.


    Chas - The one, the only.
    THANK GOD!!!
  • I disagree with the Cray thing, if my understanding of the problems with their Cray line is correct (it's not making them any money).

    Also, LANL has a large Beowulf cluster (see Avalon, an Alpha-based cluster, #160 in the top500), and a small Beowulf cluster (see Loki, a PPro-based cluster which has won some awards for advancements in parallel computing or something). They also have #2, which I think is an Intel (maybe it's SGI... I forget) designed cluster of boxen. They seem to lean towards commodity hardware and clustering over the "one massive box" method that a Cray machine would bring.
  • ...that Linux has now successfully captured the "high-end" computing marketplace? ;)

    30 Teraflops sounds nice and crunchy.

    Linux: It's what's for dinner.
  • A supercomputer is not the same thing as an enterprise server, far from it, and as such a supercomputer can have very different requirements - often directly related to the precise task it is made for and the architectural decisions taken when designing/building the supercomputer.

    An example: You mention journaled filesystems. A supercomputer doesn't necessarily need large-capacity storage or quick failure recovery. It may just need small, fast storage instead, or maybe none at all.

    Many supercomputers today are built using Linux.
  • Just in case you're not a businessman, here's the translation:
    "yes, of course we'll do it" - umm, maybe, depends on lots of factors.
    "it's possible" - but we won't do it; we might say it for publicity, though.
    "we can't do that! it's impossible" - not applicable; that sentence does not exist in business language.

    That really puts 'confirmed the possibility' in another light, doesn't it? But ask SGI if it's possible that it will run Irix or Unicos, or Windows, and they'll confirm it's "possible" as well :)
  • I believe that distributed.net was about 100 times the speed of ASCI Red about 6 months ago...

  • Maybe if you renamed it EarthQuake you could get processor time, although I doubt the beast has a good 3D graphics card, much less a mouse port.
    --
  • SGI has not announced that they are going to put Linux on this beast, only that it was "a possibility". Given that the Irix OS currently scales well beyond Linux, I doubt they'd replace it on such short notice. The problems with Irix, namely nonstandard library layout and bad security, are non-issues in a supercomputer environment.

    In related news, Microsoft has confirmed that it is "possible" that WindowsCE will be used for critical life support systems in upcoming space missions. I mean come on. Neither is strictly speaking impossible, but I'd take it with a large grain of NaCl.
    --
  • by hey! ( 33014 ) on Saturday October 30, 1999 @06:33AM (#1575996) Homepage Journal
    It depends on the dimension you're scaling in. Parallel processing power is the easiest to scale in.

    Mainframes aren't that much more powerful than desktops in processing power, but they are much more powerful in terms of I/O bandwidth and storage capacity.

    The scalability issue that most people are talking about is scalability on an enterprise network, in number of users and diversity of missions. The scalability challenge in those dimensions is manageability. The MS argument about scalability is basically that an enterprise can manage its IT assets more cheaply on NT (and it manifestly looks easier to a PHB because you use the familiar Windows GUI).

    However, I think most people have figured out by now that the "user friendliness" of Windows is basically a cardboard facade put up on a big honking hunk of complexity. I think Linux (as well as other Unices) has the opposite problem, in that a lot of its utilities appear unnecessarily complicated, but the underlying system is much cleaner and more modular. It would be cool if every package adopted the same scheme for its configuration files, perhaps XML-based:

    <netdef>
      <netname>sales-dept-subnet</netname>
      <ipnetdef>
        <ipnetaddr>192.168.0.64</ipnetaddr>
        <ipnetbits>26</ipnetbits>
      </ipnetdef>
    </netdef>

    These could be manipulated with any combination of GUI tools, Web tools, command line tools, or even special YACC grammars purpose built to your enterprise network.
  • I highly doubt that money is the big question here. When building this class of machine, performance is all that matters - if IRIX had better speed potential, they would go that way. LANL will have to pay SGI a ton of money for the performance tweaking that will have to be done to Linux anyway...

    In this respect, I think open source is a big hit - a developer of supercomputers gets some source code to build on and can change anything needed to increase the speed. If (s)he were to buy IRIX or similar, none of that would be easily available.
  • Check out http://www.gapcon.com/listg.html
    This list of the top supercomputer sites is as close as you can get to up-to-date and authoritative in that field.

    You will notice that there are no Linux installations in that list. Linux on a supercomputer has not yet been proven viable for the highest-end systems. What happens if SGI fails to deliver? The box may be installed, it may boot, but what happens to Linux's reputation if the system can't fulfill its mission?

    Also, keep in mind that SGI does need some good press. You could say that they are desperate for good press right now.
  • then it means that the conspiracy stories about how easily the NSA cracks DES codes must be wrong.

    If distributed.net, as you say, is 100 times faster than the world's fastest and it's still taking this zarking long to crack 64-bit DES ...

    PGP should be safe for a while more.

  • I'll agree that "enterprise ready" is becoming an overused buzzword. On the other hand, I can't help but agree that for many enterprise-level applications, Linux just is not ready yet. I see the following items as requirements to get there before I would feel truly comfortable using Linux in these environments:
    1. Journaling File System (JFS) - It's no good to run Linux on your 10-terabyte file server if it takes 2 days to fsck your disk array.
    2. Good stable LVM implementation with the ability to grow/shrink mounted filesystems on the fly.
    3. Good failover support (something like HP-UX's ServiceGuard).
    LVM and JFS are close (Reiserfs has JFS in beta, and I think ext3fs is close as well; LVM seems close to being stable, but I don't know if we will see any of these in v2.4). So these features are coming. But right now, I still don't feel comfortable with Linux in an environment where these features are critical.

  • The biggest Linux box/cluster/whatever is Avalon I believe, currently ranked #160, and also resides in LANL. Wasn't there one in the 50-60 range as well?

    CPlant. A US-based lab made at least one 150-node Alpha Linux cluster; I forget what was different (better) that made it much faster, but it cracked higher into the top 100 than Avalon ever did. Last I checked it was at about 120.
  • Actually upon rereading the article I could only find that SGI had confirmed that they are making a bid, not that Linux is a possibility. The rest of the article is rampant speculation.

    Or I could be wrong again, in which case chalk one up to a late night.
    --
  • If this gets through, it will finally make the PHBs take Linux seriously. They just couldn't ignore the OS that runs the fastest computer in the world.


    ---
  • Yea, I guess Etoys, DejaNews and Slashdot are not "enterprise" ready.

    Seriously though, why is this "enterprise" myth so popular? Linux seems to work fine for those enterprises as well as the UK government.
  • PS- deduct more Karma because I own the windowssucks.com domain.
  • by copito ( 1846 ) on Saturday October 30, 1999 @12:56AM (#1576009)
    SGI: We are making a bid for the T30 supercomputer
    CNET: What OS will you be using?
    SGI: No comment.
    CNET: (aha!) So what CPU will you be using?
    SGI: No comment.
    CNET: (Aha!) Can you confirm that you will be using Linux?
    SGI: No comment.
    CNET: AHA!!!!! (Whoops did I say that out loud?)

    Later...

    CNET: (Damn, I forgot to ask about alien technology, I guess I'll just go with the Linux angle. I'll throw some "experts" in there for "balance". Slashdot readers will read it....I'll be rich) Rich I tell you!! haHaHAHAHAHAHAH!!!!!! (cough)

    --
