Hardware

FPGA Supercomputers (237 comments)

Posted by michael
from the drool dept.
olafva writes: "You may be interested in this new breakthrough! See NASA Press Release and a couple of today's local stories for a remarkable paradigm shift in "Computing Faster without CPUs"." CmdrTaco said he'd believe it when he saw it. Well, they've got pictures. (Update: 03/29 5:02 PM by michael : At NASA's request, we've modified the links in the above story to reduce the load on their Public Affairs website. The same content is at the new links.)
This discussion has been archived. No new comments can be posted.


  • by Anonymous Coward
    Everyone who does research with FPGAs groans when they hear something from Star Bridge. These guys mislead the public to get VC$.

    If you're interested in FPGAs look for real research. Anyone can put together a board with a few hundred FPGAs. The real research is in designing a way of really using the chips.

    SB's original benchmark that gave them the claim of "supercomputing performance" was "How many 1-bit adders can I stick in a couple of hundred FPGAs?" Bogus.
    Fraud.
  • by Anonymous Coward on Thursday March 29, 2001 @05:05AM (#330970)
    Hey - spend some time at Tom's Hardware Page!

    An FPGA is a combination hardware/software device. If you passed that Digital Circuit Design class back in college, you remember that you can implement a 20-bit divider using - what - 84 NOR gates or something like that? There are orders of magnitude more gates in these devices, and orders of magnitude more complicated tasks can be accomplished.

    You write a 'program' as a collection of declarative statements from the "Predicate Calculus" around the internal structure of input and output pins, and the FPGA compiler figures out which "gates" to "program" in the "field".

    As the number of gates, intermediate terms, inputs, and outputs has grown, so has the complexity of the expressions, thus programs, that these puppies can handle.
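    A toy model of that compile step, sketched in Python (everything here is illustrative, not any real vendor toolchain): tabulate the truth table of a declaratively written boolean function and emit a sum-of-products "netlist", one AND term per true row, OR'd together -- roughly what a naive synthesizer would program into the gates.

```python
from itertools import product

def synthesize(fn, n):
    """Toy 'FPGA compiler': tabulate an n-input boolean function and
    return its minterms -- one AND gate per true row of the truth table."""
    return [bits for bits in product((0, 1), repeat=n) if fn(*bits)]

def run(minterms, *inputs):
    """Evaluate the 'programmed gates': OR over the matching AND terms."""
    return int(any(term == inputs for term in minterms))

# A 3-input majority function, written declaratively and 'compiled' to gates
majority = synthesize(lambda a, b, c: int(a + b + c >= 2), 3)
```

    (Real synthesizers then minimize the resulting logic; this sketch skips that step entirely.)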

  • by Anonymous Coward on Thursday March 29, 2001 @05:15AM (#330971)

    There are a lot of groups working on similar stuff:

    http://www.ccm.ece.vt.edu/acs_api [vt.edu] - This is my group and I apologize for the lame web page.
    http://splish.ee.byu.edu [byu.edu] These guys do very good work, especially when it comes to hardware description languages.
    http://www.east.isi.edu/projects/SLAAC/ [isi.edu] We like these people too.
    http://www.annapmicro.com [annapmicro.com] A lot of our graduates go here.

    There are several more groups - you can find a more complete list on the People section of ISI's web site.

  • by Anonymous Coward on Thursday March 29, 2001 @06:55AM (#330972)
    This FPGA-reprogramming trick came up about a year or so ago. If I recall correctly, the company involved has as its star actors a couple of high-powered intellectual property attorneys and a huckster of sorts who holds a patent on a reprogrammable audio mixer console using similar technology. The company at the time was begging old, cast-off FPGAs from Altera et al. so they could 'prototype' a few boards covered with hundreds of FPGAs, show these around to various companies, and raise some venture capital. The only information available on their web site was a pitch aimed at folks with more money than wits, to invest in their company.

    The important things to note:

    1) Even though you can reprogram an FPGA in about a millisecond, the logistics of getting all the right programs to all the right FPGAs on a very dense board is left as an exercise to the reader (hint -- it is not a simple walk in the park).

    2) Even though you can reprogram an FPGA in about a millisecond (yielding the claimed 1000-times-a-second machine re-configuration), it takes many minutes (sometimes hours) for the typical VHDL or similar program to produce the code that you will want to download to those FPGAs. And, of course, if you want dissimilar loads for various groupings of those chips, you will need to repeat the above with feeling, over and over, and over.

    3) This particular company was crowing about their patented graphical programming language last year, and also didn't have anything real to show. In other words, no one had actually seen them push buttons, and have this magical language actually produce runnable code for all those FPGAs to do anything useful.

    As near as I can tell, this whole thing is based on some guy's idea of raising money so he can drive fast cars, etc, etc. What really hurts is seeing NASA getting sucked into this black hole...

  • *low conspiratorial tone*

    Or allow the *real* positives through...
  • they have no qualms about using quicksort for simple arrays of primitive types.

    Stability is obviously not an issue in those cases.

    --

  • Hint: strings nasapressrel.doc | less

    -cbd.

  • it takes many minutes (sometimes hours) for the typical VHDL or similar program to produce the code that you will want to download to those FPGAs

    It takes many minutes (sometimes hours) for my compiler to build a medium or large project. But I don't store the source code on my computer to run; I store the object code, so I don't care how long the compiler takes to produce it.

    I've never used an FPGA; would it not be possible to do the same thing for them? Compile a program once into "FPGA code" which then gets stored as the executable file to be sent to the chip when invoked?
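    That's essentially how real FPGA toolflows work: the slow synthesis/place-and-route step emits a bitstream file, which can then be reloaded in milliseconds. A sketch of the compile-once idea in Python (the cache directory and the 'synthesis' function are hypothetical stand-ins, not any real tool):

```python
import hashlib
import pathlib
import time

CACHE = pathlib.Path("bitstream_cache")   # hypothetical on-disk cache

def slow_synthesis(source: str) -> bytes:
    """Stand-in for the minutes-to-hours VHDL synthesis/place-and-route."""
    time.sleep(0.01)                       # pretend this is the slow part
    return hashlib.sha256(source.encode()).digest()  # fake 32-byte 'bitstream'

def get_bitstream(source: str) -> bytes:
    """Compile once, then reuse the stored bitstream like object code."""
    CACHE.mkdir(exist_ok=True)
    cached = CACHE / hashlib.sha256(source.encode()).hexdigest()
    if cached.exists():
        return cached.read_bytes()         # fast path: already built
    bits = slow_synthesis(source)          # compile once...
    cached.write_bytes(bits)               # ...store like an executable
    return bits
```

    The grandparent's point (2) still stands for workloads that want different circuits at runtime: every distinct configuration must go through the slow step at least once.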
  • Will it run DOS?
  • Depends upon what you mean by "outperform". A reconfigurable computer made of FPGA's, at least in theory, can outperform custom hardware when it comes to meeting the instantaneous needs of a task.

    Kythe
    (Remove "x"'s from
  • > How could you else have learned
    > Linux 10 years ago :)

    Simple.
    Linux Kernel 0.01 - Released 01 Aug 1991
    Linux Kernel 0.02 - Released 05 Oct 1991

    I was there too....

    Stephen L. Palmer

    ---
  • > Also,this does give rise to the idea
    > that it is time to start the 10 year
    > celebration planning

    Most definitely! Where's the party!?!?!

    Stephen L. Palmer
    ---
  • by jjr (6873) on Thursday March 29, 2001 @05:12AM (#330981) Homepage
    NewsRelease
    National Aeronautics and
    Space Administration
    Langley Research Center
    Hampton, Virginia 23681-2199

    Bill Uher
    NASA Langley Research Center, Hampton, Va.
    (757) 864-3189

    For Release: March 26, 2001
    For those of you who can't read the Word document, here it is:

    RELEASE NO. 01-021

    NASA Langley to test New Hyper Computer System
    Computing Faster Than Engineers Can Think

    NASA Langley engineers are exploring new tools and techniques that may move them and the projects they develop beyond the serial world into a parallel universe.

    Via a Space Act Agreement, NASA Langley Research Center will receive a HAL (Hyper Algorithmic Logic)-15 Hypercomputer from Star Bridge Systems, Inc. of Midvale, Utah. The system is said to be faster and more versatile than any supercomputer on the market and will change the way we think about computational methods.

    Taking up no more space than a standard desktop computer and using no more electrical current than a hair dryer, the HAL-15 is the first of a new breed of high performance computer that replaces the traditional central processing units with faster Field Programmable Gate Arrays (FPGAs). These are specialty chips on a circuit board that can reconfigure themselves hundreds or thousands of times a second. This makes it possible for multiple applications to run at the same time on the same chips making them 1000 times faster than traditional commercial CPUs. This maximizes the use of millions of transistors (gates) on each silicon array. Traditional processors, because of their general purpose design, are wasteful, since for most applications they use only a small fraction of their silicon at any time.

    HAL is programmed graphically using the company's proprietary programming language, VIVA. This language facilitates rapid custom software development by the system's users. Besides NASA Langley, other users will include the San Diego Supercomputer Center, Department of Defense, Hollywood film industry and the telecommunications industry.

    -more-

    NASA Langley is among the first in the world to get "hands on" experience with the new system. It will be implemented to explore:
    -Solutions for structural, electromagnetic and fluid analysis
    -Radiation analysis for astronaut safety
    -Atmospheric science analysis
    -Digital signal processing
    -Pattern recognition
    -Acoustic analysis

    Media Briefing: A media briefing will be held at 9 a.m., Tuesday, March 27, at the Pearl Young Theater Newsroom, Bldg. 1202, 5 North Dryden Street at NASA Langley Research Center. There will be a news briefing and short demonstration at 9 am followed by a demonstration and discussion for scientists and engineers. HAL developer Kent Gilson and Star Bridge Systems, Inc. CEO Brent Ward will conduct the demonstration. Two Langley researchers, Dr. Robert Singletarry and Dr. Olaf Storaasli, trained on the new system and will report on their first-hand experiences with the hypercomputer.

    -end-

  • ... on slashdot? Not that I don't welcome more news on the subject, but I remember the story on StarBridge, which was greeted with nearly universal skepticism (I among the skeptics, I'll admit). Wonder how they're doing with investors now?
    --
  • by battjt (9342)
    I can't find a "FPGAs for Idiots". Can you recommend a site for me to get started?

    I've talked about this stuff at a highly conceptual level for years and have a very strong CS background, but I keep getting lost in the marketing literature.

    Thanks,
    Joe
  • by Lando (9348)
    Hmmm, so NASA is looking at a HAL computer system... Anyone know what Dave thinks?
  • Yeah, that's exactly what springs to my mind when I try to come up with uses for a supercomputer the size of a PC. To run my coffee pot.

    Finally I can actually make coffee at home; I've always wondered how they ran the coffee pot at 7-11 - where I buy all my coffee - but now I know: They use a supercomputer!

    This also explains why Starbucks coffee is so expensive... they've been using these "hypercomputers" in a secret back room at each store.

  • A media briefing will be held at 9 a.m., Tuesday, March 27, at the Pearl Young Theater Newsroom, Bldg. 12

    Geez, I coulda gone to see this in person.

    Offtopic Msft bash seen on 3COM:

    "The performance of the server connection depends heavily on the network operating system and underlying protocols. UNIX operating systems appear better adapted to handling Gigabit Ethernet speeds, while the TCP/IP protocol running under Microsoft NT 4.0 still has much room for improvement. TCP/IP is a connection-oriented and complex protocol that requires high CPU bandwidth to process packets at gigabit per second rates. "

  • for typical pc use it would be too expensive and painstaking to program

    Bullshit. No more expensive and painstaking than it was to make a pentium processor and a Windows operating system. Christ but both of those architectures are nightmares of complexity, and yet they still got built.

    No, the real problem is that it's a wholesale change in the way of thinking about solutions and applications, and we don't have enough engineers and programmers trained to think that way.

    Yet.

  • There's no such thing as NT on the Alpha chip. Even if there was, I'm not sure I'd want to screw up the VMS machines over here.
  • I am 36. I turn 37 in April. Birthdays don't necessarily match year-end dates or OS anniversaries precisely. To clarify, I began playing with Linux early in 1992.
  • ...and learning to write screenplays at 36 (are they going to suck as much as The Matrix?) doesn't dis-prove that you can learn easier when you are younger...

    Point well taken. I wasn't trying to disprove anything, merely to cite by example that learning a new skill as one grows older isn't a problem at all. As for it being easier to learn when young, that is true of some things (languages) at particularly early ages (before six or eight years of age being the typical ages cited), but is certainly unproven for anything beyond that. For example, AFAIK it is unproven that learning German at 21 is easier than learning German would be at 31 or 41.

    To answer your question, it is quite likely that my screenplays will suck far worse than the Matrix. :-)
  • by FreeUser (11483) on Thursday March 29, 2001 @06:23AM (#330991)
    Somebody with years of experience in traditional programming probably won't find their skills translate too easily. The investment in layers of abstraction built on traditional processors is too big ever to throw away, but this kind of a machine is a nifty trick to have available.

    It is extremely cool to have this technology emerging. As for our years of skills translating, or not, it isn't really all that important. We will simply learn how to program this new equipment, from scratch if necessary.

    It is a myth that the young learn better than the less-young. As an example, I learned German at 21 (and am now very fluent), Linux at 26, how to fly a plane at 33, and am now learning to write screenplays at 36. (As an amusing counterpoint I will almost certainly never learn to spell, even at 60. Not because I cannot, but because I have better things to do with my time, and a spell checker when absolutely necessary, but most of all, because I take perverse pleasure in yanking the grammar nazis' chains). While I doubt I'll be performing any airshows, or attending the Oscars, anytime soon, the point remains: we have already been taught how to think and learn. Learning how to use and program FPGAs won't be that big of a problem, with or without years of programming experience behind us.
  • by Delphis (11548)
    And jeez.. I know this might be a foreign idea to a lot of people, but THERE'S MORE TO LIFE THAN MONEY!

    I know you're just trolling, but why is everything always money money money.

    --
    Delphis
  • Oh my fsking god! You're right. I don't know how I could have missed that.

    Your reading comprehension skills must be amazing. Please tell me -- what is your IQ?

    d
  • It is a myth that the young learn better than the less-young.

    I too have learned stuff as I've gotten older, but that wasn't what I meant. The history of computer science since the 40s and 50s really hasn't been as shallow as people like to think. Somebody else's comment to the effect that "things of an essentially linguistic nature can all be learned in 21 days" is short-sighted. Go to any computer science library in any university. Look at the shelves and shelves of books, journals, and papers. Stuff on compiler design, cache performance and optimization, several hundred decades-long debates percolating under the general heading of language design, relational databases and object databases and which is better...

    It's easy to forget that five decades of very smart people have dedicated their careers to advancing this whole "computer science" thing. In our current historical situation, the entire field has been flattened down to "what can I do with web browsers and servers?" in the popular mind. People start to believe that something like J2EE represents all of human thought regarding computer science, or at least, all of it that's worth preserving.

  • by WillWare (11935) on Thursday March 29, 2001 @06:04AM (#330995) Homepage Journal
    Programming a bunch of FPGAs (essentially an ocean of gates and flipflops) is necessarily pretty different from programming a general purpose sequential computer. It's interesting to see Star Bridge's thoughts [starbridgesystems.com] on this, and why they're optimistic about this approach.
    The VIVA project was initiated several years ago to bring high-level computer language capability to FPGA programming and to take advantage of the massively parallel capabilities of FPGAs. FPGAs are cheap to make, much cheaper than complex microprocessors such as the Intel Pentium III. The yield rate is higher because the deposition densities are much more uniform for FPGAs than for microprocessors. Furthermore, the entire chip surface can be dedicated to usable transistors, with the potential to provide orders of magnitude more computing capability on the same size chip.
    They go on to describe a hierarchical GUI that connects functional blocks to make bigger functional blocks. Somebody with years of experience in traditional programming probably won't find their skills translate too easily. The investment in layers of abstraction built on traditional processors is too big ever to throw away, but this kind of a machine is a nifty trick to have available.

  • Wow, sounds like it could be useful for CERTAIN things, but still amazing nonetheless. I always hear of this amazing new technology coming out: FPGA supercomputers, solid state hard drives, REAL 3D monitors that cost $5 to make from existing LCD displays, emulated gills to breathe under water, etc.

    I just wish some of these things could make it to my house. Is it because of the ridiculous marketing and business planning that these inventions depend on to succeed, or is it just because they don't want to market these ideas and sell them to dead-end companies?

    I'm not totally sure, but I'd like to know what's stopping some of these things from making it to the end user.

  • by Erik Hensema (12898) on Thursday March 29, 2001 @05:05AM (#330997) Homepage

    I couldn't read the press release (MS Word - bah), but judging from the websites, the FPGA is dynamically programmed to perform very specific tasks in hardware.

    Since these specific tasks can run in hardware, they will run 1000 times faster than a Pentium. There is no way in the world this machine is going to run general purpose applications at this speed. Only very specific, small algorithms. Sorry, no 6000 fps for Quake ;-)

    This makes the machine useless for everyday use in your home. However, I agree this machine may be very useful for flight-control computers.

  • From the Daily Press article: "It looks like any other computer case (the rectangular part of a PC that contains all the chips and wiring to run it)".

    But I looked at the pictures and that was simply not the case! The case being, it didn't look like a case. Uuuhh, should I be writing this in upper case?

    Aargh, that damn coffee. How fast will it compile my kernel?
  • Well, now when you comment on it, isn't that reversed? Or do you just lose all your points?
  • by shaka (13165) on Thursday March 29, 2001 @05:06AM (#331000)
    From the Daily Press coverage: "People could hook into central hypercomputers to run their entire households -- from the coffee pot to the television set, the shower to the garage door"

    Yeah, that's exactly what springs to my mind when I try to come up with uses for a supercomputer the size of a PC. To run my coffee pot.
    Finally I can actually make coffee at home; I've always wondered how they ran the coffee pot at 7-11 - where I buy all my coffee - but now I know: They use a supercomputer!
  • I believe that viva is Latin for life... as in vivisect... I could be wrong though
  • HAL said it too near the end of the same movie.

  • by Taurine (15678) on Thursday March 29, 2001 @06:05AM (#331005)
    One would have thought that the natural format to choose for a press release on a web site would be HTML, just like the rest of the web site it is hosted on. That way, 100% of the world's Internet users, who are the only ones that will be able to retrieve the file, will be able to read it, regardless of the operating systems and user software they choose to install and/or pay for.

    Further, most of the 95% of the World that you believe use MS Word are not the people that will have any interest in reading about this. The people who are interested are mainly scientists and engineers, two groups who tend to be more likely than average to use a platform other than a PC running some version of Windows. These guys are more likely to write things in LaTeX than Word. But they will have an equal chance with everyone else of being able to read HTML.

    I certainly don't have any software installed on my system that can read Word files. I know of several programs that could do an approximate conversion, but why should I install extra software, using my time and computing resources, to read this, when it's not even close to the format that any reasonable person would have expected it to be in anyway?
  • Damn it. Don't post comments first thing in the morning without the proper amount of coffee!

    Bryan R.
  • by BRock97 (17460) on Thursday March 29, 2001 @06:09AM (#331007) Homepage
    So, NASA makes this announcement, and Sony goes right around and announces that they have been working closely with NASA to develop the Playstation 5 based on this technology. The PS5 which begat the Playstation 4 developed by the NSA which begat the Playstation 3 developed by IBM's super computer division will allow the game player to control the console from any NASA station in the world! Imagine playing Tekken Tag Tournament Hyper Z 2K10 Script Kiddie Edition with the folks on the International Space Station! From all that I have read, I think I will have to wait for the PS5 instead of the PS4 and PS3. Thanks Sony marketing engine!!!!!!!

    Bryan R.
  • by toofast (20646)
    Enlighten my lack of knowledge... but what is a field programmable gate array (FPGA)? Is it another weird acronym like a Global Regular Expression Parser (grep) or Packet Internet Groper (PING)?

  • ... and pictures, too.

    HAL-15, desktop model (the one NASA is testing)
    http://www.starbridgesystems.com/prod-hal1.html

    HAL-300, the rack-mounted, 12.8 TeraOp version
    http://www.starbridgesystems.com/prod-hal3.html

    The Star Bridge website seems strangely non-Slashdotted, considering how much trouble I had getting the NASA sites to load.

    When I saw this one, I was sure it had to be an early April Fool's joke, but it looks like they're for real. The company's hype still sounds pretty pie in the sky, but if they can deliver even 10% of what they're promising, a hell of a lot of computational power could be available in a few years.

    They cite cost savings in chip design (simpler, lower power, etc.) and chipfab retooling as a point in their favor (a single type of chip, customized for different applications). They cite it for speed of implementation, rather than reduced cost, but presumably that would come later. The HAL-300 is priced somewhere around $26 million, so don't bother to check eBay for a few months yet.
  • These systems have already been available for quite some time from Star Bridge Systems [starbridgesystems.com] and have already been featured on /. a few times (search for `Star Bridge Systems'). But they're still reaaaaally cool. There's a lot of information about them on the Star Bridge Systems website (see link above).

    I think the only real reason that NASA is going to be `one of the first', is simply the fact that nobody seems to buy these things. Which is a pity. What's really REALLY sad, is that their claim to have a $1000 version available by now (link to /. article [slashdot.org]) is still vaporware.

  • If this thing could be reconfigured to be a better hardware graphics accelerator than a dedicated hardware graphics accelerator like the GeForce 3, then I'd REALLY be impressed :-)

    75 GFLOPS for the GeForce 3 - kinda hard to beat.
  • Yep, shuffle sort can be implemented in O(n) on a massively parallel computer.

    What'd be neat would be if they sold this thing at a price reflecting its cost (an FPGA chip) rather than the customer's ability to pay... then we could all play with them.
  • by SpinyNorman (33776) on Thursday March 29, 2001 @08:54AM (#331026)
    Of course, but this is actually easy to do. I remember taking a VLSI design course (based on the Carver Mead/Lynn Conway book) back in college around 1980 and designing a "memory cell" with a built-in comparator that could swap the contents with the neighboring cell.... the "sort algorithm" then consists of loading the memory and clocking it N times! :-)
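    A software model of that design, sketched in Python: each "clock tick" is one phase of odd-even transposition sort, and after N ticks an N-cell memory is sorted. The serial inner loop stands in for compare-swaps that would all fire simultaneously in hardware, which is what makes the hardware version take only O(N) time.

```python
def odd_even_transposition_sort(data):
    """Software model of the hardware sorter: on each 'clock tick' every
    adjacent cell pair (alternating even/odd phases) compares and swaps;
    in hardware all swaps in a phase happen in parallel, so n ticks
    suffice to sort n cells."""
    cells = list(data)           # 'load the memory'
    n = len(cells)
    for tick in range(n):        # 'clock it N times'
        start = tick % 2         # alternate even and odd phases
        for i in range(start, n - 1, 2):
            if cells[i] > cells[i + 1]:
                cells[i], cells[i + 1] = cells[i + 1], cells[i]
    return cells
```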

  • I remember an article in Discover magazine (online here [ias.edu]) that talked about a "Star Machine." It was called "GRAPE," it was for the study of globular clusters, and one of its iterations was the first teraflop system ever built.

    It was used for calculating the gravitational interaction of thousands of bodies -- a very parallel and complex problem. The solution was many custom processors in parallel, and it was so successful (and cheap!) that it outperformed multi-million dollar supercomputers at a fraction of the cost.

    The downside was that it was a single-use system -- it could only do the calculation it was hard-wired to do.

    Since the site is slammed, I can't see what they're actually doing... but the name is sure close. The FPGA idea is neat, because it would relieve the single-use limitation.

    I'm still not holding my breath waiting for one of these to appear under my desk, though...

  • "There's no such thing as NT on the Alpha chip."

    Actually, there is [microsoft.com].
  • by p3d0 (42270)
    You can think of an FPGA as a digital circuit simulator. You can design any digital circuit, and an FPGA can simulate it roughly one or two orders of magnitude slower than the circuit would run if you made a real IC out of it.

    Logic operations can be described with truth tables. FPGAs contain programmable truth tables (called lookup tables, or LUTs), so you can implement whatever logic operation you want. They also contain programmable interconnects that allow you to join your LUTs in any way you want.

    Usually, they also contain some memory, because it takes a lot of LUTs and interconnects to build memory, and the resulting memory would be very slow and wasteful.

    How is this faster than a CPU? Well, the win comes when you design a custom circuit to perform a certain task, rather than using a general-purpose CPU. For instance, if you could make a circuit to do something at 100MHz when it would take, say, 100 Pentium instructions, then your FPGA would outperform a 10GHz Pentium!

    Used in this way, FPGAs are the ultimate parallel computer. They have many thousands of very small processing units (LUTs).
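    A minimal sketch of the LUT idea in Python (purely illustrative): the configuration bits are just the output column of a truth table, and feeding one LUT's output into another plays the role of the programmable interconnect. A half adder is used here only as an example.

```python
class LUT:
    """A k-input lookup table: the config bits are the truth-table
    output column, indexed by the input bits."""
    def __init__(self, bits):
        self.bits = bits                  # e.g. [0, 1, 1, 0] programs a 2-input XOR

    def __call__(self, *inputs):
        index = 0
        for b in inputs:                  # the inputs select a row of the table
            index = (index << 1) | b
        return self.bits[index]

# 'Program' two LUTs and wire them into a half adder
xor_lut = LUT([0, 1, 1, 0])               # sum bit
and_lut = LUT([0, 0, 0, 1])               # carry bit

def half_adder(a, b):
    return xor_lut(a, b), and_lut(a, b)
```

    Real FPGA LUTs typically have four to six inputs, but the principle is the same.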
    --
    Patrick Doyle
  • FPGAs can't outperform custom hardware. They can outperform CPUs because CPUs are general-purpose hardware that run programs in serial, while FPGAs are general-purpose hardware that run programs in parallel. But special-purpose hardware will always win. (Whatever technology you make the FPGA out of, you could just make the custom hardware on the same process and get an order of magnitude improvement.)
    --
    Patrick Doyle
  • No, nothing short of an infinitely-parallel machine can do O(n) for sorting. Remember, the big-O notation refers to asymptotic complexity, which means the problem size increases without bound. If your computer is not infinitely parallel, then there will exist some n which is too large to fit.
    --
    Patrick Doyle
  • by wiredog (43288)
    From the press release: HAL is programmed...

    HAL, yeah, right, "Open the goatsex link HAL" "I'm sorry Dave, you know I can't do that"

    And we're still 2 days from 01-04-01.

  • Starlab and GRAPE are not commercial systems, but instead are a collaborative effort between many universities, each building and tweaking the design. At the physics dept here at Drexel University, we have a several-year-old (GRAPE-4, I think?) system that can outperform our 64-node Beowulf cluster for the calculation it was intended for. Stick that in yer pipe and smoke it.
  • One area where these babies ought to shine is in modern cipher systems. DES, Twofish, as DSPs, maybe even as CODECs (though that might be a bit much for the gate budget)... anything where you can partially evaluate the program into hardware. This would work very well on a daughter card, where you can set up DMA controllers to just pump data in and out of the FPGA.

    The hard part is designing the circuit. Compilation down to silicon is a known hard job, with layout and drawing abstraction boundaries being two main stumbling blocks.

    blue sky musing: On the horizon for mainstream acceptance are profiling feedback optimisers, which produce specialised versions of code that run very fast for a limited set of [common] inputs. These currently go from a higher level language to a lower level language (java JITs like HotSpot or transmeta's codemorphing) or from the lowlevel language to the same lowlevel language (HPs dynamo).

    It would be really cool to see this technology applied to creating FPGA configurations, where the meta software notices that a certain basic block is taken often, and has mainly bit-twiddling operations. If it is taken often enough, and is long enough (this is where the specialisation of dynamo comes in -- it basically just creates optimised long basic blocks), it makes perfect sense to compile it to silicon.

    Eventually, the GHz race WILL peter out, and we'll be forced into this sort of generalised specialisation to get the 90/10 any faster.
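    The feedback loop described above is easy to sketch (the threshold and the 'synthesis' hand-off are hypothetical stand-ins): count executions per basic block, and once a block runs hot, route future calls to a 'compiled' version.

```python
import collections

HOT_THRESHOLD = 1000      # hypothetical trip count before 'compiling to silicon'
counters = collections.Counter()
compiled_blocks = {}      # block_id -> 'hardware' implementation

def execute_block(block_id, fn, *args):
    """Interpret a basic block while counting executions; once it runs hot,
    pretend to hand it to an FPGA synthesis backend and call that instead."""
    if block_id in compiled_blocks:
        return compiled_blocks[block_id](*args)   # 'hardware' fast path
    counters[block_id] += 1
    if counters[block_id] >= HOT_THRESHOLD:
        compiled_blocks[block_id] = fn            # stand-in for slow synthesis
    return fn(*args)
```

    The asymmetry the parent describes lives in that last step: in a software JIT the "compile" is fast, while synthesizing to gates is the minutes-to-hours part, so only very hot, long-lived blocks would pay off.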

  • neato.

    I wonder: would it be more useful to market these as reprogrammable CPUs? I.e., don't make the poor hardware designer design the whole CPU; give them a few instruction slots where you take care of the decoding, in-order commit, and speculation, and they get to design the actual instruction.

    In outline, they'd declare: this instruction reads registers x, y, z, writes a, b, c, and will require so many cycles to complete after its inputs arrive.

    Has this been tried and failed, is this what they do, or are there other reasons why It Would Never Work?
  • An extremely important thing to remember about all programmable logic devices is that a tremendous amount of the die of a programmable logic device is wasted compared to an equivalent ASIC. Programmables require massive chip real estate for the routing of signals. The routing structure is, arguably, the most important aspect of the device. You can put in all the whiz-bang specialized circuits you want, but you have to be able to use them (and, use more than one of them if they're tiled throughout the device). Routing is what enables that.

    Also, the logic primitives in a programmable (e.g. a slice in a Xilinx Virtex FPGA) can run extremely fast; they are not the limiting factor in getting speed from an FPGA. The much bigger issue is routing.

    In addition to the actual logic and routing, there are configuration bits (the SRAM/FLASH/antifuse bits that are used to actually cause the device to implement the logic you want) and the support logic to program the configuration bits. There are millions of configuration bits on larger FPGAs. And don't forget the fact that the IO cells in most FPGAs support multiple I/O standards and usually contain a flip-flop and a small amount of miscellaneous stuff (e.g. a couple muxes for the output enable and clock select).

    On the software side, generating logic equations is well known. The issue is in taking advantage of the specific architecture of the targeted device and all its special features. And the other issue is finding the optimal routing between the logic resources and memories you've used. Both of these issues have been and continue to be researched.
  • I seem to recall that there was development on an application of AI to find better ways to design logic gates and such. They put it to the test and found it optimized a bunch of circuits that were hand-designed (something like 300 gates down to 75 gates).

    I guess what I'm getting at is that yeah, a programmer could design and lay out the chip according to his needs, but wouldn't it be better to describe the chip (à la a C program) and run it through another system that would program your chip most efficiently?

  • Logic gate programming is slightly more complex than syntactic abstraction. I know a good number of computer engineers that gave up and went back to traditional computer science (whose programming skills can apparently be learned "in 21 days").

    Having gone completely through the process myself, it's as easy as skiing for me, so I can't objectively analyze it.

    The biggest problem is in debugging; you have to trace through dozens, hundreds or thousands of "signals" on a simulator. Logging is also not always an option.

    -Michael
  • Well, NAND isn't the fastest (active buffers, cheater-switches, and inverters fill that role). Plus, most logic devices I've encountered provide macro-cells that perform efficient and often-used functions. The ones I've worked with (in class) utilized a two-stage multi-AND / single-OR cell to provide generic combinational logic. And then they provide macro-cells such as multiplexers, memory cells, or what-have-you.

    The details are exploited or emulated by the synthesizer stage (if memory serves). Thus you can abstractly program with VHDL or what-have-you and not worry too much about what's really happening. I'm curious to learn what 'VIVA' adds to the development environment. Maybe it's Visual VHDL (tm) with drag and drop widgets. :)

    -Michael
  • I've played with this idea since college back in the 80s... if you have a massively parallel grid of single-bit computing cells with a delay of a single clock on ALL operations, almost all timing problems can be resolved and race conditions are only a matter of looking for circular references. If you use a graphical editor to lay out your data flows, you can program the thing in a fairly simple manner. You can get phenomenal flow rates for data; searching for a data string could happen at the maximum transfer rate of the hard drive, for example. Doing pipelined operations at 100 MHz (a very conservative clock rate) could allow for all of the feature recognition that the human eye does, in hardware, in real time, with more precision. The possibilities that fall out of abandoning the von Neumann architecture are so varied and vast, it's like trying to describe what's possible when you switch from animal to steam power.

    I want to see a grid of 1000x1000 single-bit clocked cells that can be reprogrammed on the fly... I'll pay up to US$300 for one to play with, provided it does the clocking as I specified above. At a bare minimum I could do FFTs in real time on a 100 MHz 12-bit data stream with it.

    --Mike--
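    (A toy software sketch of the synchronous grid idea above -- my own illustration with a made-up cell rule and grid size, not anything from a real product. Because every cell reads only its neighbors' outputs from the previous clock, updates are double-buffered and races can't happen:)

```python
W, H = 8, 8  # tiny illustrative grid, not 1000x1000

def step(grid, rule):
    """One clock tick: every cell updates simultaneously (double-buffered)."""
    nxt = [[0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            left = grid[y][(x - 1) % W]   # neighbor output from the PREVIOUS clock
            up   = grid[(y - 1) % H][x]
            nxt[y][x] = rule(grid[y][x], left, up) & 1
    return nxt

# Toy 1-bit cell function: XOR of self and two neighbors
xor_rule = lambda c, l, u: c ^ l ^ u

grid = [[0] * W for _ in range(H)]
grid[0][0] = 1                # single seed bit
grid = step(grid, xor_rule)   # after one tick the seed has spread to 3 cells
```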

  • Think of each cell in the array as a few bits of a fairly large SRAM block. The only difference is that the memory stored is actually the program for a 1,000,000 bit cpu. Data has to be piped in via the edges, I would probably bond out 32 bits from each corner to make a reasonable package size.

    As far as "storage" (RAM) in the traditional sense, there is none... just the states of the individual bit computers. Taken in combination, you can program anything from a pipeline multiplier through string comparison, etc. Pipe the data in one corner, and out the other, using DMA to feed it from the main system bus.

    I hope that all makes sense... I'm tired, and need my bandwidth fix, thanks to NorthPoint's demise.

    --Mike--

  • Um... that Word file tried to change my normal.dot template. Did anyone else encounter this? Is NASA spreading infected Word files?
  • "Within 10 years, we should use this machine in various places, handling various problems," Singleterry said.

  • by stevens (84346) on Thursday March 29, 2001 @05:40AM (#331060) Homepage
    Well I would have taken a look at the press release... if it wasn't in fscking MS Word format. Sigh.

    Abiword [abisource.com] runs on just about any platform you can use on a PC and reads MS Word files pretty well. It reads this press release just fine.

    Steve
  • Star Bridge Systems named it HAL; NASA bought it from them.
  • This story is slightly misleading in its references to NASA. The computer system is the FPGA system designed by Star Bridge Systems. There were previous mentions of the HAL system in the following articles:

    What Happened To Starbridge's Supercomputer [slashdot.org]

    Reconfigurable Supercomputers [slashdot.org]

  • by Savant (85811) on Thursday March 29, 2001 @06:55AM (#331066)
    I'm a programmer at Xilinx working on an internal tool our IP developers use, and I have to say that that's not how FPGAs work. The chips have flip-flops and LUTs (Look-Up Tables) in a regular matrix; the LUTs hold 16 values and act essentially as truth tables indexed by 4 inputs. Hence they can imitate any gate with the same number of inputs, be it XOR or NAND or any other gate (or even some combination of 2-input gates which has 4 inputs and a single output). This is, of course, a very simplified explanation, but the principle is the same even with the more advanced FPGAs.

    'Gates' figures on FPGAs are thus rough estimates of how many NAND gates would be needed to provide similar functionality.

    Savant
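    (The LUT-as-truth-table idea is easy to model in software. This is my own toy sketch, not Xilinx code; a 4-input LUT is just 16 stored bits indexed by the inputs:)

```python
def make_lut(fn):
    """Build the 16-entry table (the 'configuration bits') for any 4-input function."""
    return [fn((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) for i in range(16)]

def lut_eval(table, a, b, c, d):
    """The 'hardware': a pure table lookup, no gates at all."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]

# The same LUT imitates any 4-input gate, depending only on its contents:
nand4 = make_lut(lambda a, b, c, d: 0 if (a and b and c and d) else 1)
xor4  = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
```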
  • Sorry, no 6000fps for Quake ;-)

    Why not? The algorithms may work best for small-grain problems, but what is any graphics program but something that computes thousands of pixels at the same time? I'd imagine image-processing (in general) is highly parallelisable at the pixel level.

  • by haystor (102186) on Thursday March 29, 2001 @07:33AM (#331077)
    Naw, if NASA really wanted to screw with the SETI@Home crowd they could plant some false positives.
  • Hehe..

    Suddenly Quicksort is not the best sort algorithm, and the traveling salesman becomes possible to solve!

    Even though we touched on hypercomputing at university, some of my basic premises and rule-of-thumb knowledge will be outdated.

    I'll have to learn anew to program using logic and logic blocks; at least I'll get back to my scientific (mathematical) roots!

    Whee...

    For once Computer Science may actually become more of a Science!

  • Hehe... I realise this.

    It does get easier... this machine's design gives an order-of-magnitude improvement, not just N times faster. This is, sadly, still below the factorial scale of most NP-complete problems.

    Just making a point that what we have been taught may be nullified by advances in technology. Things like quantum computers may however approach that computing capacity, and I see this machine as a step in the right direction.
  • by coyul (119455) on Thursday March 29, 2001 @07:21AM (#331084)
    They go on to describe a hierarchical GUI that connects functional blocks to make bigger functional blocks. Somebody with years of experience in traditional programming probably won't find their skills translate too easily.
    In my digital logic class in university we had FPGA boards from Altera in the lab. To program them, you defined your components in VHDL, then connected them in a GUI that resembled that of any other visual object-oriented IDE (which I admittedly don't use). If you want the output of one component to feed into the input of another component, you just draw a line between them. This is not difficult. From this GUI you can easily pull up the VHDL description of any component and edit it if you need to. Reading the 'Programming VIVA' section on Star Bridge's homepage [starbridgesystems.com], their environment is remarkably similar. Trust me: if some of the folks in my class could make things work in this kind of environment, no programmer worth the name should have any difficulty adapting...
  • They're calling it HAL. Mommy.
  • "Make me a cup of coffee."

    "I'm afraid I can't do that, Dave."

  • I wonder if you could make a specialized machine, with a bunch of FPGAs, solely for the purpose of AI for massive scale online games. Most MMORPGs have famously stupid AI because making smart creature AI takes both lots of cycles and very good code. Could a specialized box designed for these computations be a salable device?

  • Suddenly Quicksort is not the best sort algorithm, and the traveling salesman becomes possible to solve!
    No known exact algorithm for the traveling salesman problem runs in polynomial time, and an FPGA does not change that. For a large enough number of cities, the problem is still intractable, and that number of cities (N) is not very large, since the number of possible routes is on the order of N factorial.
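    (A quick back-of-the-envelope, my own numbers, showing why a constant hardware speedup barely dents factorial growth:)

```python
from math import factorial

def tours(n):
    """Distinct closed tours of n cities (fixed start, direction ignored)."""
    return factorial(n - 1) // 2

ten_cities = tours(10)            # 181440 candidate routes
# Going from 10 to 13 cities multiplies the work by 12 * 11 * 10 = 1320,
# so even a 1000x faster machine buys roughly three extra cities.
growth = tours(13) // tours(10)
```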
  • Yep, shuffle sort can be implemented in O(n) on a massively parallel computer.
    Only if you assume the size of the machine increases with the size of the collection to be sorted.
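    (One concrete such scheme is odd-even transposition sort: n phases, and within each phase every compare-exchange is independent, so with O(n) comparators in hardware the wall-clock time is O(n). A sequential Python sketch of the idea, my own illustration:)

```python
def odd_even_sort(a):
    """Odd-even transposition sort. Each phase's swaps touch disjoint pairs,
    so a parallel machine could do a whole phase in one step."""
    a = list(a)
    n = len(a)
    for phase in range(n):
        for i in range(phase % 2, n - 1, 2):  # disjoint pairs: parallelizable
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```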
  • Does anyone else find it slightly disturbing that NASA would name a computer "HAL"? 2001 anyone?


    http://www.bootyproject.org [bootyproject.org]
  • Yes. The book will become a reality.. or maybe April 1st will be... /Geggibus "Ignore him"

    Yeah, after I saw the headline my first thought was to check the date and make sure it wasn't April 1st! :)
    http://www.bootyproject.org [bootyproject.org]
  • by DCheesi (150068) on Thursday March 29, 2001 @07:38AM (#331105) Homepage
    I don't know anything about this company, so I'll have to take your word for it. But I don't think this is as implausible as you make it sound.

    I'm assuming that what they're planning is to have a sort of standard library of FPGA loads for different functions, and programmers will write programs by picking the right loads for each device group. This, no doubt, is what that special language is for, so that programmers won't have to understand all the gory details in order to write code for it. Any custom loads that need to be created will be synthesized at compile time; compilation will be slow, but the run-time can be fast.

    Admittedly, programming all those individual FPGAs on the fly is a complex and difficult task, but then, I doubt that most programs will be reconfiguring so often in the real world. Their 1000/s number is a maximum, and may not apply when you're trying to program multiple loads into multiple devices.

  • We're kind of straying off topic here, but NANDs are often preferred because the p-channel transistors are in parallel. N-channel transistors switch faster than p-channel ones, and thus don't suffer as much from being placed in series.

    There's no space or cost benefit of a NAND over a NOR - they're both 2N transistors for N inputs.
  • by bornie (166046) on Thursday March 29, 2001 @05:09AM (#331112) Homepage
    "Since these specific tasks can run in hardware, they will run 1000 times faster than a Pentium. There is no way in the world this machine is going to run general purpose applications at this speed. Only very specific, small, algorithms. Sorry, no 6000 fps for Quake ;-)"

    Humm.. 1000 times faster, 6000fps in Quake with this, do you really mean to imply that you only get 6fps in Quake with current technology? :)
  • by TheOutlawTorn (192318) on Thursday March 29, 2001 @04:57AM (#331121)
    NASA's SETI@Home team has unexpectedly jumped ahead of all other teams, with 3.74 billion work units processed over the last three days. A NASA spokesperson has been quoted as saying "Up yours, Sun Micro!"
  • Um... that Word file tried to change my normal.dot template. Did anyone else encounter this? Is NASA spreading infected Word files?

    For some reason, Word always does that to me whenever I try to open two or more documents at the same time. I don't know why and I wish it would stop, but it doesn't seem to be a virus. (I just scanned with NAV and the document came up clean.)

    --
    BACKNEXTFINISHCANCEL

  • by NiceBacon (202600) on Thursday March 29, 2001 @04:54AM (#331123)
    ... a Beowulf cluster of these. *punch* *ow* *sorry, sorry!* *ow*
  • by Atlantix (209245) on Thursday March 29, 2001 @09:11AM (#331131)
    Absolutely. You don't have to recompile the code every time you want to turn on an FPGA system. With Xilinx FPGAs, you store the object code in a reprogrammable PROM and on power-up the FPGA just reads the PROM to find out what it's supposed to do. Altera chips integrate the PROM and keep their programming when turned off, so they start up faster.
  • A chip that re-configures itself? 1000 times faster than a Pentium 4?

    But what are its specs on the dreaded Q3 fps test?

    "Dr. Chandra, will I dream?"

    "No, but you will be sued to oblivion over your name."

    These guys jumped the gun. April 1 is a couple of days off. [ridiculopathy.com]

  • I've been watching this company since 1999 or so. Back then they were claiming they would have a box on the market priced in PC-range within 18 months. Looks like that's going to remain vaporware for the foreseeable future. Now the only mention I can find on their website about it is this:

    Personal computers. The company believes that some day PCs will come equipped with the same supercomputer technology found in the company's Hypercomputers.

  • I don't know where that unwieldy explanation for "ping" comes from. At any rate, "ping" is an onomatopoeia of the sound old-time audible-range sonar makes, not an acronym. As Mike Muuss, the man who created ping, says [army.mil], "From my point of view PING is not an acronym standing for Packet InterNet Grouper, it's a sonar analogy."
  • Surely this is some sort of April Fools' joke, right? I mean, the damn thing is called HAL. Also, I'm sure we're all aware that the current year is 2001, which would make this sort of joke much more likely.
    Ratguy
  • I saw an early announcement some years ago referring to this machine. Their performance claims [starbridgesystems.com] were something like a sustained speed of 13 trillion operations per second executing 4-bit adders. The speed quarters when you use 16-bit adders instead; imagine what happens when you try to implement something complex.

    I emailed them about this at the time, but didn't receive a reply 8o)

  • There's an article that talks about FPGA computing in this week's Economist.

    Scroll down to the "Machines that Invent" heading for the really interesting part. David

    http://www.economist.com/printedition/displayStory.cfm?Story_ID=539808 [economist.com]

  • 75% right. "Field Programmable" means it is programmable in the field, rather than mask-programmed at the factory. Some FPGAs are based on EEPROM (Electrically Erasable) or Flash ROM technology, but obviously for this job you want the ones that are based on RAM. That means they have to read their program every time they power up -- that's a disadvantage when you are just using the FPGA as a permanent replacement for a bunch of hardwired logic chips, but perfect when you want to change the program every time you use it.
  • I hadn't been able to open the page with pictures when I wrote this, and I don't think I'll bother now. It does sound like LabVIEW. I program in LabVIEW. It's a good way to design a screen form, but a terrible way to code. I'd rather code in C (or better yet, some higher-level text-based language) with some tool to allow what-you-see-is-what-you-get screen designs, but our biggest customer made the choice of LabVIEW... (I'm not prejudiced against graphic design in general -- when I'm doing circuit designs, I prefer drawing a schematic to coding in VHDL or Verilog. But if you are coding software, text does work better.)

    Besides that, I wonder how well their software really works. From what I've heard about conventional FPGA design software, you code in a hardware description language (Verilog or VHDL), then run a simulation to verify the code, then you try to compile it to a physical layout -- and try, and try, and try. If fast operation is needed, you've got to intervene manually to arrange the layout so connections on critical paths are short. If you want to use even half the gates on the chip, you've got to intervene manually in the layout so it doesn't run out of connection paths in the densest areas. I don't think it likely that these people have found a magic way around that. More likely, their system will only work if you never try to use more than 1/4 of the possible gates or speed...
  • by markmoss (301064) on Thursday March 29, 2001 @07:56AM (#331168)
    "why aren't we all ditching our amd's/pentiums and buying one of these little babies?"
    A) It's only faster on certain problems where the computations can be performed massively in parallel. And most CPU's already spend 99% of their time waiting for data to arrive from memory or the hard drive, or for the operator to click the mouse.

    B) It's a s-o-b to program. You aren't writing software, you are designing a custom hardware circuit to solve the problem, which is then implemented by programming logic gates and connections in the chips. In other words, on a computing job where you could write a program in C in a week and it would run in 1 minute on a PC, on FPGA's it might take a year to design and run in a millisecond. So if reducing the run time is worth paying six figures for software development, go for it... Maybe the HAL people have found a way to ease the programming, but it's still going to be quite a lot harder than normal programming.

    Just guessing this box might hold 100 FPGA's at $25 each. Plus it has to have a normal computer in there to hand the programs and data out to the FPGA's. So it costs more than a PC, but maybe not as much as a top-end workstation (depending on how big a profit margin they are taking). It's great for a rocket navigational system, but the only down to earth applications I can think of for a machine this big are professional video processing, weather prediction, and some really heavy engineering simulations.

    On a smaller scale, cell phones and future modems are likely to include some FPGA-like circuits, probably as a small part of a custom chip rather than as a separate FPGA. When a new protocol comes out requiring revised circuit design, you do the changes in the FPGA program and distribute it to be downloaded.

    No government could stop this; FPGA's are sold worldwide and used extensively for prototyping and occasionally for production. Maybe they'll try to restrict the HAL programming language.
  • by Anonymous Admin (304403) on Thursday March 29, 2001 @05:36AM (#331176)
    "It uses no more energy than a hair dryer" That is 1500 watts. My apartment is small enough that I would have to keep the windows open in the wintertime to keep from roasting in here...
  • Well, at least it's not HAL 95. Or HAL XP. Then we'd know who's really controlling the government.
  • And it's a NAND gate because they are (physically) the easiest logic gates to build - and the cheapest, I'd imagine.

    Claric
    --

  • by garns (318370) on Thursday March 29, 2001 @06:22AM (#331186) Homepage
    I attended the press briefing. First I would like to note that the presenter was a very likable guy who was open to questions and very knowledgeable. He had an example with the HAL computer calculating the Julia set vs. a PIII 850. The difference was amazing. You could zip around the set on the HAL, where the PIII kinda skipped around at about 1/3 fps. Finally, the price that was quoted: 1 millllion dolllars!!! Worth it?? Time will tell.
  • I don't think there's any question that if this becomes mainstream, a fairly comprehensive library of digital logic functions will be developed, similar to C++'s STL or Java's class libraries. The Xilinx software I used in my digital design course already had a pretty good selection of SSI and MSI components (BCD functions, adders, shift registers, etc.), and obviously further libraries would be developed, both for common algorithms and specialized ones (e.g. scientific).

    BTW, if anyone is really interested in FPGA's, Xilinx [xilinx.com] has a hellass pile of info here [xilinx.com].

    Finally, I wanted to ask any current FPGA users if they find that they get different performance stats on the same design on different compiles. When I was doing work on Xilinx, I found that the compiler would produce designs of varying speed, based on routing and the number of CLBs it used. On a couple of occasions, my longest path delay was decreased by about 25% just because I recompiled a couple of times.

  • Current FPGAs don't run faster than general-purpose CPUs, megahertz-wise; actually, nowhere near it. The advantage is that you can do lots of things in parallel, e.g. build a massively parallel machine. It's a huge task to take a machine like this (effectively a bunch of empty chips) and make it do something useful.

    The company who makes these computers has been around for a few years.

    As to reconfiguring 1000s of times per second, that seems a bit unlikely. Typically programming time on a Xilinx FPGA is at least a second, in my experience.

    Hamish

    Disclaimer: I work with FPGAs for a living.
  • by Canonymous Howard (325660) on Thursday March 29, 2001 @08:51AM (#331194)
    Ya gotta love it when someone quotes the press release verbatim and it gets modded "funny."

  • I used to work for a company [annapmicro.com] that manufactures a very similar device as an add on card for PCs. True enough, a single transistor on the FPGA in each of these devices is capable of firing much faster than the clock speed of available processors. However, this is the switching speed of a single transistor on the device. When transistors are chained together, you get a phenomenon called gate delay, which is the amount of time each transistor takes to react to its inputs before the output level is changed. So if a single transistor is 1000 times faster than the clock speed of a PII, and we chain 1000 of these transistors together, our usable clock speed is now the same as the PII. Another item of worry for the designers of the image to go on the FPGA is clock tree generation. The clock signal for the FPGA must be generated in such a way that all areas of the chip are synchronized. Very often, the clock tree is the biggest problem in the design as it skews as each route gets longer.

    These devices are fantastic if you have a very specific application that you wish to design them for (e.g. image processing, voice analysis, SETI@Home). With the ability to be reconfigured at a moment's notice, they are also much more reusable than an ASIC. But don't be misled by the speeds given in the marketing info. Get a demo chip from Altera [altera.com] or Xilinx [xilinx.com] and play with it for a while. Then make your own judgments about speed.
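    (The gate-delay arithmetic above as a back-of-the-envelope sketch; the numbers are made-up illustrations, not datasheet values:)

```python
transistor_toggle_hz = 100e9   # assume one transistor can switch at 100 GHz
levels_of_logic = 1000         # transistors chained between storage elements

# The clock can tick no faster than the longest chain can settle:
usable_clock_hz = transistor_toggle_hz / levels_of_logic  # 100 MHz
```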
