Intel Pledges 80 Core Processor in 5 Years
ZonkerWilliam writes "Intel has developed an 80-core processor that it claims 'can perform a trillion floating point operations per second.'" From the article: "CEO Paul Otellini held up a silicon wafer with the prototype chips before several thousand attendees at the Intel Developer Forum here on Tuesday. The chips are capable of exchanging data at a terabyte a second, Otellini said during a keynote speech. The company hopes to have these chips ready for commercial production within a five-year window."
Shame BeOS Died... (Score:5, Informative)
Not that you couldn't do threading right in Windows, OS X, or Linux. But BeOS made it practically mandatory: each window got its own thread, in addition to an application-level thread, plus any others you wanted to create. So making a crappy application that locks up while it is trying to do something (like update the state of 500+ nodes in a list; ARD3, I'm looking at you) actually took skill and dedication. The default result tended to be applications that wouldn't lock up while they worked, which is really nice.
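The BeOS-style pattern above can be sketched in any language with threads. A minimal sketch in Python (not the BeOS API; the window names and event strings are made up for illustration) — each "window" runs its own event loop in its own thread, so a slow job in one window never blocks the others:

```python
import threading
import queue

handled = []  # records (window, event) pairs; list.append is thread-safe in CPython

def window_thread(name, events):
    """Each 'window' drains its own event queue in its own thread."""
    while True:
        event = events.get()
        if event is None:      # shutdown sentinel
            break
        handled.append((name, event))

# One thread (and one event queue) per window, BeOS-style.
windows = {}
for name in ("Browser", "Tracker"):
    q = queue.Queue()
    t = threading.Thread(target=window_thread, args=(name, q))
    t.start()
    windows[name] = (t, q)

windows["Browser"][1].put("redraw")
windows["Tracker"][1].put("update 500 nodes")  # slow work stalls only Tracker

for t, q in windows.values():
    q.put(None)   # tell each window thread to exit
    t.join()
```

The point is structural: because events are dispatched per window, a long-running handler in one window's queue can't freeze the rest of the UI.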
Re:Hey now... (Score:2, Informative)
Clarification from TFA:
"But the ultimate goal, as envisioned by Intel's terascale research prototype, is to enable a trillion floating-point operations per second--a teraflop--on a single chip."
Further clarification from TFA:
"Connecting chips directly to each other through tiny wires is called Through Silicon Vias, which Intel discussed in 2005. TSV will give the chip an aggregate memory bandwidth of 1 terabyte per second."
Re:Hey now... (Score:3, Informative)
Next!
Re:Moores Law (Score:5, Informative)
Re:Apple and Microsoft and BSD better hurry and sc (Score:5, Informative)
The BSD projects, Apple, and Microsoft have five years. Microsoft announced a while back that they want to work on supercomputing versions of Windows. Perhaps they will have something by then. Apple and Intel are bed partners now; I'm sure Intel will help them.
What this announcement really means is that computer programmers must learn how to break up problems more effectively to take advantage of threading. Computer science programs need to start teaching this shit. A quick "you can do it, go get a master's degree to learn more" isn't going to cut it anymore. There's no going back now.
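The basic decomposition skill is simple to state: split the data, work the pieces in parallel, combine the results. A minimal sketch in Python (names and chunk counts are illustrative, not from the article):

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Work on one piece of the problem independently of the others."""
    return sum(chunk)

data = list(range(1_000_000))
n_workers = 8

# 1. Break the problem into independent chunks.
chunk_size = len(data) // n_workers
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

# 2. Farm the chunks out to a pool of workers.
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    partials = list(pool.map(partial_sum, chunks))

# 3. Combine the partial results.
total = sum(partials)
```

(Caveat: in CPython the GIL means CPU-bound work like this wants processes rather than threads, but the decomposition pattern — split, map, reduce — is identical either way, and it's that pattern the parent says schools need to teach.)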
But how do they interconnect? (Score:5, Informative)
The big question is how these processors interconnect. Cached shared memory probably won't scale up that high. An SGI study years ago indicated that 20 CPUs was roughly the upper limit before the cache synchronization load became the bottleneck. That number changes somewhat with the hardware technology, but a workable 80-way shared-memory machine seems unlikely.
There are many alternatives to shared memory, and most of them, historically, are duds. The usual idea is to provide some kind of memory copy function between processors. The IBM Cell is the latest incarnation of this idea, but it has a long and disappointing history, going back to the nCube, the BBN Butterfly, and even the ILLIAC IV from the 1960s. Most of these, including the Cell, suffered from not having enough memory per processor.
Historically, shared-memory multiprocessors work, and loosely coupled network based clusters work. But nothing in between has ever been notably successful.
One big problem has typically been that the protection hardware in non-shared-memory multiprocessors hasn't been well worked out. The InfiniBand people are starting to think about this. They have a system for setting up one-way queues between machines in such a way that applications can queue data for another machine without going through the OS, yet while retaining memory protection. That's a good idea. It's not well integrated into the CPU architecture, because it's an add-on as an I/O device. But it's a start.
You need two basic facilities in a non-shared memory multiprocessor - the ability to make a synchronous call (like a subroutine call) to another processor, and the ability to queue bulk data in a one-way fashion. (Yes, you can build one from the other, but there's a major performance hit if you do. You need good support for both.) These are the same facilities one has for interprocess communication in operating systems that support it well. (QNX probably leads in this; Minix 3 could get there. If you have to implement this, look at how QNX does it, and learn why it was so slow in Mach.)
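The two facilities the parent describes can be sketched with Python threads standing in for processors (a toy model, nothing like the QNX implementation): a synchronous call that blocks for a reply, and a one-way queue that accepts bulk data with no round trip per item.

```python
import threading
import queue

calls = queue.Queue()   # synchronous-call channel: (request, reply_queue)
bulk = queue.Queue()    # one-way bulk-data channel: no reply expected

def server():
    """'Remote processor': answers synchronous calls, drains bulk data."""
    received = 0
    while True:
        req, reply_to = calls.get()
        if req == "drain":
            while not bulk.empty():
                bulk.get()
                received += 1
            reply_to.put(received)
        elif req == "shutdown":
            reply_to.put(received)
            break

t = threading.Thread(target=server)
t.start()

# One-way queueing: cheap, no round trip per item.
for i in range(100):
    bulk.put(i)

def call(request):
    """Synchronous call: blocks until the 'remote' side replies."""
    reply_q = queue.Queue(maxsize=1)
    calls.put((request, reply_q))
    return reply_q.get()

drained = call("drain")     # one round trip covers all 100 queued items
final = call("shutdown")
t.join()
```

Note the asymmetry the parent insists on: the 100 bulk items cost no round trips at all, while each `call()` pays exactly one. Building one primitive out of the other forces every bulk item through a round trip (or every call through a polled queue), which is where the performance hit comes from.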
CPU speed is not the issue folks! (Score:3, Informative)
No they don't. Right now I'm building a Linux kernel and it is only using approximately 35% of the CPU. Why? Because my memory and disk are not fast enough. If I swapped out the CPU and kept everything else the same, it would not go much faster. Sure, with a faster motherboard etc. I could get better speed, but that is very difficult to scale to 80 cores.
As I said before, getting 80 cores working properly is going to require huge amounts of memory, hugely wide buses out of the chips (say 512-bit-wide buses), huge increases in disk read/write speed, etc.
Nobody is going to design 80 core systems unless someone is prepared to buy them and nobody is going to design 80-core chips if nobody can show how to design effective systems with them.
For people wanting to crank SETI etc, it is going to be way cheaper to build a cluster with 20 4-core systems.
Re:Apple and Microsoft and BSD better hurry and sc (Score:5, Informative)
So think more like Cell with 80 SPEs. Great for lots of vector processing.
Re:nVidia should be worried.... (Score:2, Informative)
Bingo. It was called the Micro 2000. A quick Google search dug up a BYTE article from 1991 [byte.com].
According to the article, the processor was to have four CPUs (which today would be referred to as "cores"), a couple of vector processing units, a graphics unit, and 2MB of cache.
From what I remember, Intel was advertising the Micro 2000 as the one chip which would take care of all the multimedia (ah, that word reminds me of the '90s) functions, more or less a system-on-a-chip.
True, general purpose processors have always won out, but it seems like we are entering a world with a lot of surplus processing power which may be able to be utilized for graphics, sound, etc.
Either that, or we'll start having centralized computers with multicore processor(s) to which other computers connect to do heavy processing. Something like thin clients [wikipedia.org]...
*sigh* (Score:2, Informative)
Intel's prototype uses 80 floating-point cores.
Very interesting in itself, but not the same as 80 CPU cores, which is what the summary hints at.
Re:But how do they interconnect? MPI (Score:3, Informative)
There's a standardized library for this: MPI, the Message Passing Interface. Rather than the OS needing to define a new API, the folks creating high-speed interconnects just ship optimized MPI libraries (in order to sell their hardware). Folks writing codes for hundreds of processors tend to want to treat them as array elements, so chaotic calling of procedures just isn't that useful. So the RPC support he is asking for is not really important.
In other words, the software stack for using large numbers of processors is already well-known. No need for any new OS features.