Japan's Petaflop Supercomputer

slashthedot writes "Japan has built the fastest supercomputer in the world. While the BlueGene/L contains 130,000 processors, Japan has managed to create the first Petaflop supercomputer, called MDGrape-3, with just 4808 chips, and it cost just $9 million to develop."
  • Wow (Score:5, Funny)

    by 9x320 ( 987156 ) on Sunday July 30, 2006 @09:40AM (#15810779)
    Making that computer must have been harder than getting a story from MSN posted on the main page of Slashdot!
  • Progress (Score:5, Informative)

    by Eightyford ( 893696 ) on Sunday July 30, 2006 @09:44AM (#15810797) Homepage
    It now costs 15 dollars per gigaflop. In the early 90s, a million dollars per gigaflop was normal.
    • But, you still can't get 100 gigaflops for 1,500 dollars. :(
      • Re:1,500 $ (Score:3, Insightful)

        by Eightyford ( 893696 )
        But, you still can't get 100 gigaflops for 1,500 dollars. :(
        I'm sure Sony's PS3 will be advertised as having 1000 gigaflops for a few hundred dollars.
        • It's actually 2000 GFlops for a few hundred dollars.


          And calling this a Petaflop supercomputer is similarly misleading, for roughly the same reason. The PS3 gets its 2 TF from the GPU, which can process 384 flops per cycle in an architecture built specifically to shade pixels. Likewise this MDGrape-3 is built at the hardware level to solve the n-Body problem, and that's it.

      • Re:1,500 $ (Score:2, Interesting)

        by smallfries ( 601545 )
        No? 'cos a GTX-7800 does 320Gflop/s and you could buy a few of those for $1500...
        • No? 'cos a GTX-7800 does 320Gflop/s and you could buy a few of those for $1500...

          True - but we're talking general purpose operations here.
          • Well sure we could be. The actual supercomputer in the article isn't a general purpose machine although it does run the Linpack. They managed to get such a high performance by limiting the operations, much like a GPU. In more general terms a processor capable of 8Gflop/s can be had for about $100 - so general purpose flops would be about 120Gflop/s for the $1500. Not quite as impressive but still quite high...
    • Comment removed based on user account deletion
      • Most of that power is in the GPU, and GPUs are extremely specialized and at the present are not very good at much of anything but graphics processing.
      • I have a terrible feeling that I'm wrong, but will anyone be able to correct me?

        Sure.

        RSX: You could put a more powerful GPU into a PC and get better performance numbers, so why count GPU power at all? Also, you cannot do 64-bit floating-point math with ANY GPU at the moment, and what the GPU does compute has non-IEEE-standard accuracy, so remove it from the equation.

        Each SPE can do 25.6 Gflop/s theoretical (180 Gflop/s for all 7), but only for 32-bit (non IEEE-standard) values. For 64-bit accuracy, tests have shown the thorou
    • Check out the company Mathstar ( http://www.mathstar.com/ [mathstar.com]). They just taped out a chip the other day that, when it comes to market, will do about 500 Gflops a chip. The technology is quite incredible, and although it is not specifically a general purpose chip, it can be programmed to work in any way you like, allowing you to get max performance for the applications that you need to run. Honestly I would like to get a hold of about, say, 50 of these and see what I could make them do in parallel (as they ar
    • From Page 2 of TFA

      No other supercomputer at the top of the rankings can muster so much calculating brawn on such a tiny budget. That's partly because MDGrape-3 relies on fewer chips and less circuitry than rivals. It's also because the chief scientist, Dr. Makoto Taiji, working with only two other researchers, had plenty of help from Hitachi, Intel, and NEC subsidiary SGI Japan.

      Those companies supplied the hardware -- Hitachi made the central processing unit, or CPU --

  • machines like this (Score:2, Interesting)

    by Neuropol ( 665537 )
    should be used in conjunction with the topic from the previous article: creating countless means by which to not only find vulnerabilities in things like Javascript, but equally to construct fixes to those vulnerabilities. Once it creates an open door, it generates the fix for closing it and keeping it closed. Machines like this can think thousands of times faster than your average black-hat-crackah, so why not use them as a fight-fire-with-fire tool?

    Everyone is so concerned with internet safety, one would
    • by x2A ( 858210 ) on Sunday July 30, 2006 @09:54AM (#15810836)
      Having a computer do something very very fast is only of any use if you have the software to do what you want done very very fast. As far as I know, the hard part of what you suggest is writing such capable software, not running it.

      • agreed. as i think about it more, i feel that a computer like this would need to have that 'AI-like' temperament that would allow for the active 'thought' process to be learning what to check for vulnerabilities. A whole lot of if-then.

        If the resources are available to crack rc5, to do distributed-based work on a cure for cancer, and crunch data captured from radio antennas in search of little green men from mars, then I think we have the know-how necessary to get something like this up and running.

        It mak
        • by NewbieProgrammerMan ( 558327 ) on Sunday July 30, 2006 @10:43AM (#15811052)
          If the resources are available to crack rc5, to do distributed-based work on a cure for cancer, and crunch data captured from radio antennas in search of little green men from mars, then I think we have the know-how necessary to get something like this up and running.

          Well the examples that you mention are not really the same as "attempting to break software and search for problems long before release." If I understand these issues correctly: (1) (with apologies to crypto specialists) RC5 cracking required lots of CPU time to factor a big-ass number, (2) projects like Folding@Home aren't "looking for a cure for cancer," they're running (I think) quantum chemistry simulations to find out how certain molecules can act in certain situations, and (3) SETI@Home is looking for specific patterns in signal data. In all three of these cases, there's a few common (maybe not so simple) operations that need to be applied to a large set of data or initial conditions, and that's why they need lots of machines, or fast machines.

          Figuring out how clever people will take advantage of a particular implementation of a web browser or TCP/IP stack is a completely different class of problem IMHO. Yeah, maybe there's some clever AI techniques that may simulate attack attempts, and maybe they could come up with attacks that nobody has thought of yet, but a really fast computer will not somehow magically solve these kinds of problems for us. There's a lot of hard science and software engineering that needs to be done first.

        • There's a huge difference between searching for unknown problems in unknown places, and searching for solutions to set problems. The latter involves running a set function with different input values, and checking the result. The former involves several layer deep multi-property pattern recognition, and nothing comes anywhere near the brain at doing this.

    • You're wrong here.

      It's not about being fast. It's about creative ways to do things that interfaces weren't intended for.
      Your idea would work out as soon as you have a way to replace artists with computers.
    • Cancer research sounds a little better than preventing-your-browser-from-misbehaving research. But at only 9 mil a piece, why not both?

      In fact, you could put thousands of these machines together for less than 10 billion. For 10 billion dollars you could crack any reversible cryptographic algorithm in the universe on a weekend. I call that world domination.

      Maybe Gates still has interesting things to do with his life after all.
  • Efficiency (Score:3, Interesting)

    by Eightyford ( 893696 ) on Sunday July 30, 2006 @09:46AM (#15810807) Homepage
    The article says that this machine is much more efficient than other supercomputers. Is it actually cheaper to run large programs like SETI@HOME on a supercomputer? Electricity isn't cheap.
    • From what I've heard about this particular petaflop supercomputer in the past, it isn't a general purpose computer even in principle. It's built for a special purpose, and that single purpose is what it can do at petaflop speeds. Nothing else. BlueGene and those in the same range are a bit more general purpose, if you could call it that.
    • Well, in the case of SETI@Home, it wouldn't be cheaper to run on the supercomputer - SETI isn't paying for the power to run all those CPUs out there in people's homes and offices.
      • Right - it splits the cost among a large group of people, rather than having one organization pay for supercomputer time. Of course, they could try to fund this through donations, but I think that people are more likely to run a program for SETI than to start sending them checks.
    • Re:Efficiency (Score:3, Interesting)

      Is it actually cheaper to run large programs like SETI@HOME on a supercomputer?

      This computer is efficient at what it does largely because it's extremely specialized. It's built specifically for working on molecular dynamics, but from the looks of things, it's probably close to useless for nearly anything else.

      As such, it would probably work quite nicely for Stanford's folding@home project (which studies protein folding, i.e. molecular dynamics). It probably would not work very well for seti@home, bec

      • Re:Efficiency (Score:4, Insightful)

        by Duncan3 ( 10537 ) on Sunday July 30, 2006 @12:12PM (#15811551) Homepage
        To put that into perspective, consider that the Blue Gene/L has 65536 processors. seti@home has over a million hosts and folding@home has a couple hundred thousand more.
        Try comparing active hosts to active hosts. SETI "active" means anyone they have ever seen, and always has. Just compare TFLOPS. Folding@home has been larger for a very long time, tho SETI may be catching up, depending on how much you bend their stats.

        Of course, if you compare USEFUL results, it's Folding@home: lots (over 50 papers), SETI: 0

        The Japan box will be faster for a little while than Folding@home, but will also likely produce RESULTS instead of just a lot of global warming.
        • Try comparing active hosts to active host. SETI "active" means anyone they have ever seen, and always has.

          Ah, I wasn't aware of that -- I mentioned SETI primarily because the OP did. My own spare cycles all go to F@H...

          Of course, if you compare USEFUL results, it's Folding@home: lots (over 50 papers), SETI: 0

          Quite true -- and IMO, likely to remain that way (and thus, my decision about where to contribute...)

  • Incorrect chip count (Score:5, Informative)

    by Bushcat ( 615449 ) on Sunday July 30, 2006 @09:48AM (#15810814)
    The original article seems to be unreachable, so I can't read it, but the precis has the wrong chip count: It does have 4808 LSI chips, but it also has 19,122 Xeon processors.
    • I read the article - don't waste your time. No doubt it's a cool machine, but the article was the flimsiest puff-piece I've ever seen linked on Slashdot. Complete lack of technical detail, moron-level explanations of common terms - I feel stupider having read it.

      Are there any good articles on this machine that anyone would care to share?
      • by rgravina ( 520410 ) on Sunday July 30, 2006 @10:13AM (#15810902)
        This article here from Riken themselves has some more technical details:

        http://mdgrape.gsc.riken.jp/modules/tinyd0/index.php [riken.jp]
        • thank you for this one!

          As someone else already said, and mentioned in Parent's link, this is a very specific machine, for Molecular Dynamics simulations; everything from memory handling to processing is optimized only for handling particles and doing force calculations on them. Therefore, it'll serve a relatively small market.

          That said, I'm very curious to see how fast it'll run gromacs [gromacs.org], the MD program I use. This is pretty optimized for parallel simulations already, and I'm able to do the calculations I

        • Their custom chip draws 19 W at 350 MHz (fastest) or 16 W at 250 MHz (typical), using 130 nanometer tech.

          I wonder how much lower they could have pushed the power draw by using a 90nm or 65nm fab?

          note: The system "cost" $9 mil because... that's what their budget was. The chip builders ate some of the cost.

  • by ZachPruckowski ( 918562 ) <zachary.pruckowski@gmail.com> on Sunday July 30, 2006 @09:48AM (#15810816)
    Will this run Vista at a decent speed, or should I wait for the Rev B and SP1?
  • by StarWreck ( 695075 ) on Sunday July 30, 2006 @09:51AM (#15810821) Homepage Journal
    If this petaflop supercomputer really only costs $9 million and only occupies the space of a large walk-in closet, why don't they mass-produce it and sell it? No, not to individuals, but to corporations and governments. Folding@Home and Seti@Home could suddenly be like, sorry guys, we don't need you anymore - we got something better. Having hundreds of copies of this super computer could quickly solve problems across the globe that much slower supercomputers are currently having trouble with!
    • Having hundreds of copies of this super computer could quickly solve problems across the globe that much slower supercomputers are currently having trouble with!

      Because nobody is writing parallelisable code, or if you like, computer languages don't readily support multi-threaded code. It's always a construct verging on a hack that frequently goes horribly, horribly wrong. Until multi-threading in languages is as seamless and usable as calling a subroutine, parallel computing will never take off.
      • Er, all the computers on the Top500 run parallelized code.

        Typically, they use libraries (not built-in language features) to do it.

        And it's not done using multi-threading.

        What isn't that common yet is consumer apps that are parallelized. Scientific apps got there a decade ago.
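
        As a rough illustration of that library-based style (this is only a sketch: it assumes mpi4py and an MPI runtime are available, and the work being split - summing a range of integers - is just a stand-in for a real scientific kernel):

          from mpi4py import MPI   # message-passing library, not threads

          comm = MPI.COMM_WORLD
          rank = comm.Get_rank()   # this process's id
          size = comm.Get_size()   # total number of processes

          # Each rank sums its own slice of 0..999,999; no shared memory involved.
          n = 1_000_000
          local_sum = sum(range(rank, n, size))

          # The library combines the partial results across all processes.
          total = comm.reduce(local_sum, op=MPI.SUM, root=0)
          if rank == 0:
              print("sum =", total)

        Run with something like "mpirun -n 8 python sum_sketch.py" (hypothetical filename); each rank is a separate OS process, which is how most Top500 codes are structured.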
    • This is far from a general purpose supercomputer. If you read the more technical article at http://mdgrape.gsc.riken.jp/modules/tinyd0/index.php [riken.jp] you will see that this thing is designed from the ground up to do molecular dynamics. So while folding@home might be able to make some use out of it, none of the other distributed projects would.
    • why don't they mass-produce it and sell it?

      The cost of this computer is actually much higher than $9 million. If you rtfa, you'll see that much of the computer was effectively donated by outside companies. The CPU design was done by Hitachi; Intel supplied other hardware, as did SGI Japan. None of this is factored into the $9 mil. It's likely that the actual cost was many multiples of that.

  • by davidwr ( 791652 ) on Sunday July 30, 2006 @09:53AM (#15810828) Homepage Journal
    NOT what the VP of Marketing wants to hear:

    "Not just a flop, but a flop a million billion times over."
  • by john_uy ( 187459 ) on Sunday July 30, 2006 @09:55AM (#15810839)
    the supercomputer is quite cheap. they can probably sell a lot of these machines and sweep the top500 list. however, it mentioned that the processor is specialized in doing astrophysics calculations. i am not sure if this will be useful for other fields.

    but the good thing about it is that it is more energy efficient. it seems the energy-efficiency trend in desktops/servers right now is also reaching supercomputers. maybe they could include a performance per watt ratio in the top500 list as well.
    • Specialised (Score:3, Informative)

      by SamAdam3d ( 818241 )
      The problem with that is that this computer is very specialised to molecular simulations. It can't very easily do other things, like seti or folding (okay, well, maybe that it can do). It was easy to design and cheap because it didn't have to be general purpose and adaptable, like BlueGene/L is.
  • Say what?!? (Score:3, Informative)

    by mosel-saar-ruwer ( 732341 ) on Sunday July 30, 2006 @09:57AM (#15810844)

    Japan has managed to create the first Petaflop supercomputer, called MDGrape-3, with just 4808 chips...

    FLOP = floating-point operation [per second].

    PETA = 10 ^ 15, or "a quadrillion".

    (10 ^ 15) / 4808 = about 207,986,688,852, which would indicate that each chip is running at several hundred TERA-hertz [and, even then, the machine would have to possess an operating system so efficient that it could consistently perform one floating point operation per clock increment, which seems extraordinarily unlikely].

    Or is this an "analog" computer and are these "analog" FLOPS?

    And no, I did not RTFA.

    • Re:Say what?!? (Score:5, Informative)

      by hattig ( 47930 ) on Sunday July 30, 2006 @10:11AM (#15810891) Journal
      The Cell processor can do ~200 GFLOPS - not IEEE-quality FLOPS, however; they're 'good enough single precision FLOPs' for its target uses. This is probably why this new supercomputer won't get into the Top500 list, because it's very specialised and thus probably nowhere near as good at IEEE-conformant calculations.

      The Cell processor is not running at 200GHz. There's this concept called 'parallelisation', it's how your graphics card can do dozens, if not hundreds, of operations per clock cycle. In Cell's case it can do 8 (number of SPUs) * 4 (128-bit registers, SIMD) * 2 (units) = 64 SP FLOPS per clock cycle, and that's not including the PPU which has VMX128 and an FPU itself.

      However make the Cell processor calculate IEEE conformant FLOPS, and it gets a double precision score of around 20GFLOPS. Still good though.

      The above was from memory, details may vary, figures are roughly correct, YMMV, etc.
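
      To spell that parallelism arithmetic out (a back-of-the-envelope sketch in Python; the 3.2 GHz clock and counting a fused multiply-add as two flops are assumptions, the per-cycle figures come from the comment above):

        spes = 8            # SPUs on the chip (7 enabled in the PS3)
        simd_lanes = 4      # one 128-bit register = 4 single-precision lanes
        fma_flops = 2       # a fused multiply-add is conventionally counted as 2 flops
        clock_hz = 3.2e9    # assumed clock

        flops_per_cycle = spes * simd_lanes * fma_flops      # = 64
        peak_sp_gflops = flops_per_cycle * clock_hz / 1e9    # ~204.8 GFLOPS
        print(flops_per_cycle, round(peak_sp_gflops, 1))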
      • Very good explanation. You could even compare this to the Human brain, which only operates at about 50Hz (if I remember my AI class properly) but can have every single one of the trillions of Neurons doing its own little threshold calculation. Granted, it's difficult to compare Neural nets to non-linear circuit systems in a meaningful way, but it does demonstrate the ridiculous extreme of parallelisation.
    • You assume 1 core per chip; it's quite possible that they have several cores per chip. Chips with 4 cores are now common, 8 is on the horizon and 16 is in the lab. Each CPU was specially built for astrophysics calculations (not sure what that means.. seems to me just to be lots of floating point) by Hitachi, which absorbed the cost of the CPU development. Also, the chips may be able to work somewhat in parallel if the software is written that way, which obviously will increase performance. So, I don't doubt the fi
    • Re:Say what?!? (Score:3, Interesting)

      by bloosqr ( 33593 )
      Yeah, it's specialized hardware; the MDGrape basically calculates Newton's law in the hardware, so it does the inverse-square calculation really super fast. There used to be an MD-GRAPE equivalent which did the same thing for Coulomb's law (as you would think there is more money in doing biosims than astrosims), but I think that died as the market was too small.

      I think this was an IBM/Fujitsu collaboration; IBM had MD-GRAPE and dropped it because of the market, and Fujitsu is still making the GRAPE..

      FYI the reaso
    • Re:Say what?!? (Score:4, Informative)

      by Hollinger ( 16202 ) <michael AT hollinger DOT net> on Sunday July 30, 2006 @11:06AM (#15811150) Homepage Journal
      Yeah, it's a bit obvious that you didn't.

      Quoting another link you can see how they reached these numbers (which I take issue with):
      The following figure shows the block diagram of the MDGRAPE-3 chip. It consists of 20 force calculation pipelines, a j-particle memory unit, a cell-index controller, a master controller, and a force summation unit. The force calculation pipeline is the most important part of the chip, which performs calculations of two-body forces such as Coulomb and van der Waals forces. Each pipeline performs 33 equivalent floating point operations per cycle when it calculates Coulomb force. Thus, when it operates at 250 MHz its performance will reach 165 Gflops with 20 pipelines. The chip also has the j-particle memory unit, which corresponds to the main memory of the CPU. Therefore, no extra memory needs to be attached to the chip.

      - http://mdgrape.gsc.riken.jp/modules/tinyd0/index.php [riken.jp]

      With that answered, I'm confused. Another poster sent along that link which explains what Riken will do. I'm confused about that actually. Reading the page, based on the verb usage, either someone didn't understand future and past tense (possible, but unlikely), or they haven't built the entire box yet. Perhaps I'm reading a bit too much into it... it's quite possible that someone simply hasn't updated the website.

      Based on the webpage, all of the calculations to reach 1 petaflop are based on theoretical peak performance measurements, extrapolated from the theoretical peak of a single special-purpose ASIC which has been built, but may or may not have been actually placed into a fully configured system. Nothing talks about measured benchmarks, and the OP's article contains the same theoretical extrapolated numbers.

      Anyone know if they've actually built it?

      ~ Mike
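
      For what it's worth, running the quoted Riken figures through a quick Python sanity check gives numbers consistent with a theoretical-peak extrapolation (the 4808-chip count comes from the summary; whether the full system clocks at 250 or 350 MHz is an assumption):

        pipelines_per_chip = 20
        flops_per_pipeline_per_cycle = 33    # Coulomb-force pipeline, per the Riken page
        chips = 4808                         # from the article summary

        for clock_hz in (250e6, 350e6):
            per_chip_gflops = pipelines_per_chip * flops_per_pipeline_per_cycle * clock_hz / 1e9
            system_pflops = per_chip_gflops * chips / 1e6
            print(f"{clock_hz / 1e6:.0f} MHz: {per_chip_gflops:.0f} Gflops/chip, {system_pflops:.2f} Pflops peak")

      At 250 MHz that is 165 Gflops/chip (about 0.79 Pflops system peak), and only at 350 MHz does the total cross 1 Pflop - all theoretical, which fits the "has it actually been built?" question above.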
    • giga not tera (Score:4, Insightful)

      by tetromino ( 807969 ) on Sunday July 30, 2006 @11:27AM (#15811276)
      (10^15)/4808 = 207 986 688 852, i.e. ~208 billion flops, i.e. if the chip executed only 1 instruction per clock, it would be 208GHz (not THz as you imply). Except of course the chip does more than 1 instruction per clock. Modern x86 chips do multiple flops per cycle. A Cell should be able to do at least 9 per cycle. I imagine that a dedicated vector processor, of the sort that NEC used to make, can do tens of flops per cycle.

      Furthermore, many processor architectures have instructions that do several basic floating point operations in one step. For instance, PowerPC has a one-cycle multiply-accumulate instruction (multiply and add in one step), so for marketing purposes, a PowerPC has twice the flops. Now, imagine if you have a vector processor that has a highly-optimized instruction for taking square roots or doing trig in one cycle. A square root operation will translate into dozens of basic flops (add, multiply, subtract). Such a processor might therefore be rated at 208 gigaflops even though its operating frequency is <1GHz.
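
      Inverting the arithmetic makes the same point (a sketch assuming, hypothetically, that each of the 4808 chips contributes equally and the clock sits somewhere in the hundreds of MHz):

        target_flops = 1e15
        chips = 4808
        per_chip = target_flops / chips              # ~2.08e11 flops/s per chip

        for clock_hz in (250e6, 300e6, 350e6):       # assumed clock range
            print(f"{clock_hz / 1e6:.0f} MHz -> {per_chip / clock_hz:.0f} flops per cycle needed")

      That works out to roughly 590-830 flops per cycle per chip: hundreds of parallel operations, not a terahertz clock.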
      • Modern x86 chips do multiple flops per cycle
        I love acronyms that are explained incorrectly..
        Floating Point Operations Per Second per cycle

        If you assume that the reader doesn't know the meaning, then just write it out to begin with. :)
    • http://slashdot.org/comments.pl?sid=06/07/30/138234&threshold=1&commentsort=0&mode=nested&cid=15810814 [slashdot.org]

      19,122 Xeons.

      (1 * 10 ^ 15) / (2 * 10 ^ 4 ) = 5 * 10^10.

      That's 50 billion floating-point operations per second. If each Xeon is dual-core, it's 25 billion ops per core per second. If they're 4GHz processors, then it's about 6.25 ops/cycle. I'm not sure how it achieves that. Even fused multiply-add instructions only do 2 ops per cycle.

      I still have to ask if this is achievable.
    • (10 ^ 15) / 4808 = about 207,986,688,852, which would indicate that each chip is running at several hundred TERA-hertz

      It implies nothing of the sort. A single chip could have several floating point pipelines.
    • Each of 5832 300-MHz execution units does 660 parallel floating-point operations per cycle, for 1.15e15 flops/sec. The Xeons do not contribute to the total; they essentially act as the microcode program that tells the vector units what to do next.

      While optimized for moldyn, it would be readily repurposed for a wide range of large-scale computations, including solving massive ensembles of linear systems. Indeed, I would be quite pleased to write a Fortran-2005 compiler or a Matlab compiler for this beast, if anyo
  • Does that mean its a giant cluster of unwanted aibos?
    • No, a Petaflop is when an animal rights activist throws themselves in the path of a fishing trawler, cattle car or some other vehicle used in the meat or fur industry. It is similar to, but not quite the same as the terraflop which is more used in anti-logging activities.
  • 9 million? (Score:4, Insightful)

    by jacklebot ( 620766 ) on Sunday July 30, 2006 @10:04AM (#15810870)
    Great. 9 million dollars to build the thing, 15 million dollars to build the infrastructure to power and cool it, probably.
    • Re:9 million? (Score:3, Insightful)

      by AC-x ( 735297 )
      "Riken's machine occupies the space of a large walk-in closet and is an energy-sipper"

      Remember the green cross code: Stop, Read, then Post.
  • Nuff said.

    Where are the really neato results we should be getting from these? I'm tired of "Country X builds massive TeraWatt computer system." I want to read about "Country X mapped the cancer genome" or some such.

    Besides, these are relatively unimpressive. Sure, in the 50s, 60s, 70s, 80s we were maturing the technology. Inventing new technology, analyzing it, etc. Now it's more of the same. Huge budget, lots of space and infiniband connections...

    Show me the MFlops/Watt rating of this? Are they improving it? Are we wasting less resources? The irony of this is they pollute by wasting tons of energy, all so we can predict global warming or whatever.

    Tom
    • "Show me the MFlops/Watt rating of this?"

      No problemo!

      The number of flops: (10 ^ 15) / 4808 = about 207,986,688,852 flops per chip - from a previous poster.
      The number of watts: 300,000 - from the manufacturer's site = about 62 watts/chip
      207,986,688,852 / 62 ≈ 3,354,624,000 flops (roughly 3.3 GFlops) per watt.
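
      The same figures as a one-screen Python check (the chip count and the 300 kW total are taken from the thread; everything else follows from them):

        total_flops = 1e15
        total_watts = 300_000
        chips = 4808

        flops_per_chip = total_flops / chips         # ~2.08e11 flops/s
        watts_per_chip = total_watts / chips         # ~62 W
        flops_per_watt = total_flops / total_watts   # ~3.3e9
        print(f"{flops_per_chip:.3e} flops/chip, {watts_per_chip:.0f} W/chip, {flops_per_watt / 1e9:.2f} Gflops/W")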

    • Your "Grumpy Old Man" impression is passable, but it's nowhere near as funny as Dana Carvey's was.
    • by NewbieProgrammerMan ( 558327 ) on Sunday July 30, 2006 @10:58AM (#15811125)
      Oh, please. This machine only uses 300kW - that's maybe the equivalent of 150 American homes. These folks are building a specialized (as in not "more of the same") machine to support a particular bit of science (molecular dynamics simulations) that isn't gonna make for flashy headlines, and I say more power to them. I'd rather there were more scientists out there doing basic research that may actually be useful, than have them chasing after stuff for headlines that will make you happy.

      And if you're trolling, yeah, you got me, so congratulations.
  • "the machine may be ineligible because of its specialized hardware"
    What specialized hardware? I would really like to read a more technical article about this machine. I would guess that the Japanese focused on vector processing like they did in the design of the Earth-Simulator [wired.com].

    The best supporting evidence I have for this conclusion is the comparison of Japan's last two supercomputers:
    Sun Fire X64 Cluster [top500.org]
    Earth-Simulator [top500.org]

    Sun Fire has 10,368 processors with an Rmax (GFlops) of 38,180.
    Earth-Simulator
  • From the article: Meteorologists use supercomputers to predict climate patterns decades into the future by analyzing huge databases of statistics.

    It all makes sense now. When they predict 90% chance of rain three days in a row and we don't see a drop, they really meant that it will rain sometime between now and thirty or forty years from now.
    • Measuring in yards/meters is easier than measuring in nanometers.
      Predicting long term weather trends is easier than daily weather conditions in your area.

      When fluid dynamics and computers get to a level where they can handle compressible fluids at the scale needed, the predictions will still be off in places that aren't the focus. Frequently the predictions for my city only come true for part of the city.
  • From the article it sounds like the whole thing is based on a large collection of specialised processors designed only for protein folding calculations, so while it may be able to do those at a petaflop rate it probably can't do anything else at nearly that rate (just as the WWII Colossus computer could beat a 486 at Enigma cracking, it certainly wasn't faster in terms of actual computing speed)
    • The MDGRAPE-3 VPUs are optimized for moldyn, but could be easily applied to a wide range of problems. They do have a robust set of floating-point instructions. I have experience in adapting a wide range of scientific problems to similar architectures (in a previous generation of capability), and can assure you that while peak hardware utilization rates would be nigh impossible to achieve for the majority of applications, a respectable percentage of the theoretical capacity could be brought to bear on a wi
  • 9 million? Sign me up - where can I get one?
  • glxgears (Score:2, Funny)

    Japan has managed to create the first Petaflop supercomputer, called MDGrape-3, with just 4808 chips, and it cost just $9 million to develop.
    Wow! I bet it gets loads of fps in glxgears!
  • Not even close! (Score:2, Insightful)

    by bockelboy ( 824282 )
    You've all been had by a reporter with an overactive imagination talking to a researcher selling his own shit. The MDGrape is a specialized processor (you can actually buy it commercially as a separate board for your computer) that does exactly one thing: particle simulation using traditional laws of physics. This will allow it to do computational molecular dynamics on the small scale or universe modeling on the large scale. All it understands is data input in the form of particle positions and will outp
    • This whole article is like comparing the rendering capabilities of your new Nvidia GPU and the latest AMD CPU, then concluding AMD is full of idiots who can't engineer because the Nvidia chip renders more polygons.

      ...and now we know the real reason AMD decided to buy ATI! :-)

    • particle simulation using traditional laws of physics. This will allow it to do computational molecular dynamics on the small scale or universe modeling on the large scale.

      Hmm - this is interesting in and of itself. What I mean by this is that here is a very specialized (and, I assume, Turing complete) computer, doing one particular job, and doing it amazingly well. Now, let us suppose the simulation of particles it does according to known physics is complete (I know it isn't). If it were, then in theory it

  • Of all the MD 20/20 varieties...grape stands out as the best.
  • I guess it would depend on the definition, whether it has to be capable of general purpose or only specialized. Technically, it should be possible to easily get petaflop performance by putting a few million into a computer using chips designed only to run LINPACK.

    Personally, I don't think it should qualify. Otherwise the EFF's $250,000 Deep Crack, which could only crack DES (although faster than tens of thousands of regular computers at that time), would qualify too.
  • How many petaFLOPS will IBM get out of a new Blue Gene made from Cell processors?
  • Not comparable (Score:3, Informative)

    by News for nerds ( 448130 ) on Sunday July 30, 2006 @12:09PM (#15811532) Homepage
    Though the theoretical performance of this computer is higher than that of BlueGene, and it may have higher real-world performance too, you can't compare this supercomputer with BlueGene and other TOP500 supercomputers since it can't run LINPACK. It's just too specialized for its use.
  • If you can just take their n^3 algorithm (with quantum it's more like n^8), and make it n^2, you can do all that on your desktop :)

    Not all progress needs to be brute force. But brute force is much more fun to brag about.

    -
  • Already the article suggests it may not be capable of running Linpack; the other question being, are these 32-bit precision operations or 64-bit precision? Linpack explicitly measures 64-bit precision. This is one reason why, despite some clustered deployments that are inevitable with the Cell processor, those won't be impressive Top500-wise despite the cries of 'OMFG, Cell has uber gigaflops'. Cell brags on the gigaflops, but the state of Cell as it is announced today is only interesting 32-bit precision w
  • I compiled some quick facts which compare those three supercomputers and added pointers to other resources for your convenience:
    http://www.bloglines.com/blog/ITnomad?id=126 [bloglines.com]

    Cheers, Alex.
  • Idiotic summary. (Score:3, Informative)

    by imsabbel ( 611519 ) on Sunday July 30, 2006 @03:39PM (#15812698)
    This computer, like all the previous (md)grape generations, is a central force potential calculation accelerator.

    it does nothing but calculate 1/sqrt(dx^2+dy^2+dz^2)*variable, but really, really often.

    Grape-6, 5 years or so ago, was already running at 200 MHz, had a throughput of one force calculation per pipeline, and 6 pipelines on one chip. So it counts as 1.2 billion force calculations per second, each being (1 inverse, 1 sqrt, 3 adds, 3 squares, 2 fmul, etc.).
    A lot of flops, but totally useless as a general purpose computer.
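
    To make concrete what those hardwired pipelines evaluate, here is a naive Python sketch of the same central-force accumulation (the softening term, the coefficient, and the O(n^2) double loop are placeholders; the real chip streams particle pairs through 20 pipelines at once):

      import math

      def central_forces(positions, charges, k=1.0, eps=1e-12):
          """Naive O(n^2) pairwise inverse-square force accumulation."""
          n = len(positions)
          forces = [[0.0, 0.0, 0.0] for _ in range(n)]
          for i in range(n):
              xi, yi, zi = positions[i]
              for j in range(n):
                  if i == j:
                      continue
                  dx = positions[j][0] - xi
                  dy = positions[j][1] - yi
                  dz = positions[j][2] - zi
                  r2 = dx*dx + dy*dy + dz*dz + eps             # softened squared distance
                  inv_r = 1.0 / math.sqrt(r2)                  # the hardwired 1/sqrt(...) step
                  s = k * charges[i] * charges[j] * inv_r**3   # 1/r^2 magnitude along (dx,dy,dz)
                  forces[i][0] += s * dx
                  forces[i][1] += s * dy
                  forces[i][2] += s * dz
          return forces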

  • For the first time, I have become worried about an unbalanced singularity. If one country reaches the singularity first, the power they would gain might allow them to prevent a singularity in other countries. The US should invest in technology to speed and guide the development of singularity technology here at home. We can't afford to let the singularity happen somewhere else first.
    • by DeXOR ( 936122 )
      Yes, we must avoid a singularity gap!
    • For the first time, I have become worried about an unbalanced singularity. If one country reaches the singularity first, the power they would gain might allow them to prevent a singularity in other countries. The US should invest in technology to speed and guide the development of singularity technology here at home. We can't afford to let the singularity happen somewhere else first.

      What do you mean? That some super-intelligent AI created somewhere in the world, would want to have an active part in human

    • ...what makes you think that we (here on earth) will be the first (in the Universe)? How would you (or any one of us) know whether this hasn't already occurred (i.e., a technological singularity has already happened elsewhere in the Universe, and we have been "isolated" in some manner to prevent us from doing the same, or at least limiting our spread should we get lucky)?


      The honest answer is "we don't know", and that we should continue on (for whatever that means) doing what we do...

  • The article is badly written. It cost Riken $9m, because NEC (as SGI Japan) paid for most of the hardware, and because Hitachi and Intel provided all but three of the workers.

    In short, Riken had almost nothing to do with the process, except for the design of the single custom chip involved, and even then, most of the work was done by outside firms who wanted the press. And even then, it still cost the host organization $9 million!
