
Intel Pledges 80 Core Processor in 5 Years

ZonkerWilliam writes "Intel has developed an 80-core processor that it claims 'can perform a trillion floating point operations per second.'" From the article: "CEO Paul Otellini held up a silicon wafer with the prototype chips before several thousand attendees at the Intel Developer Forum here on Tuesday. The chips are capable of exchanging data at a terabyte a second, Otellini said during a keynote speech. The company hopes to have these chips ready for commercial production within a five-year window."
This discussion has been archived. No new comments can be posted.

  • by Lost+Found ( 844289 ) on Tuesday September 26, 2006 @04:24PM (#16204895)
    This is hilarious, because if this goes out on the market there won't be many operating systems capable of scheduling that many cores usefully. OS X can't do it, Windows can't do it, and neither can BSD. But Linux has already been scheduling on systems with up to 1,024 processors :)
  • by davidwr ( 791652 ) on Tuesday September 26, 2006 @04:29PM (#16205019) Homepage Journal
    Today, a 2-CPU x 2-core computer can actually be slower than a 2x1 or 1x2 configuration for certain "cherry-picked to be hard" operations, because the OS makes incorrect assumptions about things like shared/unshared cache - two cores on the same chip may share cache, while two on different chips do not - and other issues arising from the fact that not all cores are equal as seen by the other cores.

    In an 80-core environment, there will likely be inequalities due to chip-real-estate issues and other considerations. The question is, will these impacts be felt at the code level, or will the chips be designed to make these differences invisible? If the former, will the OSes be designed to use the cores efficiently, or will they simply see "80 cores" and, out of ignorance, make poor decisions when allocating tasks to various cores? If the latter, what performance penalty will be incurred? (One way software can at least discover cache topology today is sketched below.)
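
    As a rough illustration of those inequalities, on Linux kernels that expose cache topology through sysfs a program can at least discover which cores share a cache. A minimal sketch; the path and the choice of index2 (typically the L2 cache) are Linux-specific assumptions:

        /* Sketch: ask Linux which cores share a cache with cpu0.
         * Linux-specific sysfs path; index2 is usually the L2 cache. */
        #include <stdio.h>

        int main(void)
        {
            char buf[256];
            FILE *f = fopen(
                "/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list",
                "r");
            if (!f) { perror("fopen"); return 1; }
            if (fgets(buf, sizeof buf, f) != NULL)
                printf("cores sharing cpu0's L2 cache: %s", buf);
            fclose(f);
            return 0;
        }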
  • by Bluesman ( 104513 ) on Tuesday September 26, 2006 @04:41PM (#16205271) Homepage
    How many processes are running on your machine?

    A basic strategy would be for the OS to devote each process to its own processor (a minimal affinity sketch follows this comment).

    This would reduce the need for TLB/cache flushes and could eliminate context switches entirely. The whole machine would be really snappy.

    That said, for a desktop machine, this is a huge amount of overkill, but with economies of scale being what they are, we'll probably have this power available soon.

    What I'd like to see more, though, is extra functionality in hardware rather than just more of it. Wouldn't it be great if hardware were able to handle some of the things an OS is now used for, like memory (de)allocation? Or if we could tag memory according to type? Or if there were finer-grained controls than the page level?
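
    A minimal sketch of that one-process-per-core idea, assuming a Linux target (sched_setaffinity is the real Linux call; the core number here is an arbitrary example):

        /* Sketch: pin the calling process to one core, in the spirit of
         * "devote each process to its own processor". Linux-specific. */
        #define _GNU_SOURCE
        #include <sched.h>
        #include <stdio.h>

        int main(void)
        {
            cpu_set_t set;
            CPU_ZERO(&set);
            CPU_SET(3, &set);                 /* arbitrary example core */

            /* pid 0 means "the calling process" */
            if (sched_setaffinity(0, sizeof(set), &set) != 0) {
                perror("sched_setaffinity");
                return 1;
            }
            printf("now restricted to core 3\n");
            return 0;
        }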

  • by stonewolf ( 234392 ) on Tuesday September 26, 2006 @04:51PM (#16205475) Homepage
    A couple of things to mention here. Many years ago - more than 10, less than 20, I think - I read an Intel roadmap for the x86 processors. In it they said they would have massively multicore processors coming along around now. They may have forgotten that and reinvented the goal along the way; companies do that. But they really have been predicting this for a very long time.

    The other thing is that, with that many cores and all the SIMD and graphics instructions built into current processors, it looks to me like the obvious reason to have 80 cores is to get rid of graphics coprocessors. You do not need a GPU and a bunch of shaders if you can throw 60 processors at the job. You do need a really good bus, but hey, that's not much of a problem compared to getting 80 cores working on one chip.

    With that kind of computing power you can throw a core at anything you currently use a special chip for. You can get rid of sound cards, network cards, graphics cards... all you need is lots of cores, lots of RAM, a fast interconnect, and some interface logic. Everything else is just a waste of silicon.

    History has shown that general purpose processing always wins in the end.

    I was talking to some folks about this just last Saturday. They didn't believe me. I don't expect y'all to believe me either. :-) The counterexample everyone came up with was, "well, if that is true, why would AMD buy ATI?" The answer to that is simple: AMD wants ATI's patent portfolio and name. In the short term it even makes sense to put a GPU and some shaders on a chip along with a few cores. But at the point you can put 16 or so cores on a chip, you won't have much use for a GPU.

    Stonewolf
  • by mad_minstrel ( 943049 ) on Tuesday September 26, 2006 @05:24PM (#16206097)
    If I were writing a game and knew that the majority of gamers had at least 80-core CPUs, I would:

    - Dedicate 45 cores to the opponent AI (which would run on simple neural nets)
    - Dedicate 20 cores to physics (because physics is the next-big-thing)
    - Dedicate 8 cores to keeping the former fed with usable data (like game logic, asset management, etc)
    - Dedicate 4 cores to 3d sound (because with so many cores it's cheaper for me to develop the sound myself than license the latest EAX from Creative, or whatever's hip at the moment.)
    - Dedicate 1 to networking and voice-chat (because the better the compression, the better the experience)
    - Dedicate 1 to coordinating the rest.
    - Leave 1 for the OS and any parallel tasks.

    Oh, and not having to make my code terribly efficient would cut my development costs a lot.
    So that's that for using 80 cores. I sure could use more in the AI department.
    And the advantage of an 80-core chip over forty 2-core chips? It saves a hell of a lot of physical space.
  • In other words, get out your functional languages like Haskell and OCaml and use the side-effect-free feature set to develop multi-threaded programs (a rough illustration of why purity helps follows). Or do it the hard way with an OO language.
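
    A rough C illustration of the point the functional languages get for free: when the mapped function is pure and each thread writes only its own slice of the output, no mutexes are needed at all. The slice layout below is just one possible arrangement:

        /* Sketch of why side-effect-free code parallelizes so easily:
         * a pure function mapped over disjoint slices needs no locks. */
        #include <pthread.h>
        #include <stdio.h>

        #define N       8
        #define THREADS 4

        static double in[N], out[N];

        /* pure: result depends only on the argument, no shared state */
        static double square(double x) { return x * x; }

        typedef struct { int lo, hi; } slice_t;

        static void *map_slice(void *arg)
        {
            slice_t *s = arg;
            for (int i = s->lo; i < s->hi; i++)
                out[i] = square(in[i]);   /* each thread owns its slice */
            return NULL;
        }

        int main(void)
        {
            pthread_t tid[THREADS];
            slice_t   s[THREADS];

            for (int i = 0; i < N; i++)
                in[i] = (double)i;

            for (int t = 0; t < THREADS; t++) {
                s[t].lo = t * (N / THREADS);
                s[t].hi = (t + 1) * (N / THREADS);
                pthread_create(&tid[t], NULL, map_slice, &s[t]);
            }
            for (int t = 0; t < THREADS; t++)
                pthread_join(tid[t], NULL);

            for (int i = 0; i < N; i++)
                printf("%g ", out[i]);
            printf("\n");
            return 0;
        }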
  • by sp3d2orbit ( 81173 ) on Tuesday September 26, 2006 @05:55PM (#16206647)
    I remember doing a project in college where we had to implement an 8-point FFT in both software and hardware. It was eye-opening. The hardware implementation ran on an FPGA with something like a 23 MHz clock. The software solution was a C program running on a 2 GHz desktop. 23 MHz vs. 2 GHz - and the hardware solution was still more than 10x faster.

    I don't think that general-purpose processors will ever completely replace special-purpose hardware. There is simply too much to be gained by implementing certain features directly on the chip. (The software half of such an exercise is sketched below.)
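
    For the curious, the software half of such an exercise might look something like this direct 8-point DFT - a sketch only, O(N^2) for clarity where a real FFT would use radix-2 butterflies, and the cosine test signal is an arbitrary choice:

        /* Sketch: a direct 8-point DFT. A real FFT would recurse with
         * radix-2 butterflies, but the O(N^2) form shows the arithmetic. */
        #include <complex.h>
        #include <math.h>
        #include <stdio.h>

        #define N 8

        int main(void)
        {
            const double PI = 3.14159265358979323846;
            double complex in[N], out[N];

            for (int n = 0; n < N; n++)    /* test signal: one cosine cycle */
                in[n] = cos(2.0 * PI * n / N);

            for (int k = 0; k < N; k++) {
                out[k] = 0;
                for (int n = 0; n < N; n++)
                    out[k] += in[n] * cexp(-2.0 * I * PI * k * n / N);
            }

            for (int k = 0; k < N; k++)
                printf("X[%d] = %6.3f %+6.3fi\n",
                       k, creal(out[k]), cimag(out[k]));
            return 0;
        }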

  • Re:Not enough demand (Score:3, Interesting)

    by modeless ( 978411 ) on Tuesday September 26, 2006 @07:06PM (#16207721) Journal
    Video processing is why consumers will eventually want 80-core chips. Many video algorithms are extremely parallelizable. Heck, modern video cards already have double-digit numbers of shader units, and consumers buy them. Generating video images in real time parallelizes just as well: software rasterizers could easily use 80 cores, and, more excitingly, real-time raytracing would become feasible - no video card required. Merely decoding HD video taxes a modern single core, and encoding is glacially slow; both parallelize easily to 80 cores, and both will be demanded by consumers in the future.

    Personally, I also believe that the people blaming software for the failures of AI are wrong, and that multi-core computing will also finally enable some interesting applications like usefully robust speech recognition, object recognition in images, 3D reconstructions from video footage, stereo-vision-based navigation for robots, and other cool stuff we haven't thought of yet. All that's still a little farther out though.
  • by Anonymous Coward on Wednesday September 27, 2006 @12:10AM (#16210457)
    Take the most fundamental arithmetic and logic blocks, mix DRAM in locally, add massively parallel interconnections capable of asynchronous data transfer and data-driven clocking (i.e. the clock ticks as soon as the result is ready), and a compiler designed for the parallel environment that is race-condition aware. Basically, your design won't even run unless you have properly described the data dependencies. Add in some engineering training that teaches pipelining and concurrency. Replace stochastic scheduling with determinism. In other words: an FPGA. Here's an example which I already linked elsewhere in this discussion: http://www.xtremedatainc.com/Products.html [xtremedatainc.com]
  • by anon mouse-cow-aard ( 443646 ) on Wednesday September 27, 2006 @02:03AM (#16211099) Journal
    They are claiming a terabyte-per-second interconnect. I think it is safe to assume it will be something like InfiniBand, Myrinet, or a similar (NEC's IXS, IBM's HPS) high-performance application networking technology.

    What you're asking for is pretty standard stuff in the high end, where hundreds of processors are quite common. Cache coherency is a killer, though, and fully coherent designs died out long ago at the high end. When you think about it, cache coherency basically requires a crossbar-switch-style memory architecture, which expands with the square of the number of processors, plus much higher-speed logic to resolve conflicts. So eventually it doesn't scale. Instead, applications running on large numbers of processors tend to use only small groupings (say 8 or so, but they can go up to 30-odd) with shared-memory/cache-coherent access, and then MPI for anything bigger.

    Clusters have been using MPI for years for this sort of thing. All the custom interconnects for supercomputing have customized implementations in their MPI libraries to take advantage of 1-sided communications. Most use a facility which can loosely be termed RDMA - remote direct memory access; another term sometimes used is OS-bypass. The idea is that for this sort of communication you want to skip the TCP/IP stack and other OS buffering overhead, and just have straight memory-to-memory copies going on (under userland library control).

    Folks generally don't directly invoke things on other processors; instead they fire off jobs on blocks of processors and have them communicate with 1-sided primitives (a sketch of such a put follows this comment). This is the sort of thing done on hundreds or thousands of processors today. It will just gradually percolate down to normal applications.
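
    A hedged sketch of such a 1-sided put, using the standard MPI-2 one-sided API (assumes an MPI implementation is available; compile with mpicc and run with mpirun -np 2):

        /* Sketch: one-sided communication with MPI_Put. Rank 0 writes
         * directly into rank 1's exposed window; rank 1 posts no receive. */
        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);

            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Each rank exposes one int that other ranks may write into. */
            int buf = -1;
            MPI_Win win;
            MPI_Win_create(&buf, sizeof(int), sizeof(int), MPI_INFO_NULL,
                           MPI_COMM_WORLD, &win);

            MPI_Win_fence(0, win);
            if (rank == 0) {
                int payload = 42;
                /* Straight into rank 1's memory: no matching receive on
                 * the target side - that is the whole point of 1-sided. */
                MPI_Put(&payload, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
            }
            MPI_Win_fence(0, win);

            if (rank == 1)
                printf("rank 1 received %d via MPI_Put\n", buf);

            MPI_Win_free(&win);
            MPI_Finalize();
            return 0;
        }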

  • by losec ( 642631 ) on Wednesday September 27, 2006 @08:41AM (#16213091)
    Erlang already supports multiple cores. It used to be that you had to start an Erlang node for every core/CPU; today a single Erlang OS process will scale to the cores available.

    But from the programming point of view, Erlang has supported multiple cores for as long as it has existed.

    In Erlang, when you send a message to another process, you don't know whether that process is executing within the same OS process, in another OS process, or even on a distant machine. That is very good, because you don't have to rewrite anything to support distribution.

    Concurrency is one of many areas where Erlang makes things easier for you.

    On the other hand, if you look at Java or Perl, those green processes or real threads (pthreads) are just slapped on, like any other library. The language doesn't have any real support for processes/threads, nor is it oriented around them. This doesn't stop you from doing Erlang-like stuff; it is just that a 10-line Erlang hello will expand into a 100-line pthread beast with sharp mutexes slashing around (roughly sketched below).
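
    To put a number on that expansion, here is a hedged sketch of the bare pthread machinery - one mutex plus one condition variable per mailbox - that a single Erlang send (Pid ! Msg) replaces. The mailbox type and helpers are illustrative, not any standard API:

        /* Sketch: a one-slot pthread "mailbox". In Erlang, the send is
         * one line and the receive is one clause; here it is all manual. */
        #include <pthread.h>
        #include <stdio.h>

        typedef struct {
            pthread_mutex_t lock;
            pthread_cond_t  nonempty;
            int             msg;
            int             has_msg;
        } mailbox_t;

        static mailbox_t box = {
            PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0
        };

        static void mailbox_send(mailbox_t *mb, int msg)
        {
            pthread_mutex_lock(&mb->lock);
            mb->msg = msg;
            mb->has_msg = 1;
            pthread_cond_signal(&mb->nonempty);
            pthread_mutex_unlock(&mb->lock);
        }

        static int mailbox_receive(mailbox_t *mb)
        {
            pthread_mutex_lock(&mb->lock);
            while (!mb->has_msg)
                pthread_cond_wait(&mb->nonempty, &mb->lock);
            int msg = mb->msg;
            mb->has_msg = 0;
            pthread_mutex_unlock(&mb->lock);
            return msg;
        }

        static void *worker(void *arg)
        {
            (void)arg;
            printf("worker got %d\n", mailbox_receive(&box));
            return NULL;
        }

        int main(void)
        {
            pthread_t t;
            pthread_create(&t, NULL, worker, NULL);
            mailbox_send(&box, 42);   /* the Erlang version: Pid ! 42 */
            pthread_join(t, NULL);
            return 0;
        }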

    References:

    New SMP feature in Erlang:
    http://www.erlang.org/doc/doc-5.5.1/doc/highlights.html [erlang.org]

    Look at chapter 3; you don't have to understand Erlang to follow:
    http://erlang.org/doc/doc-5.5/doc/getting_started/part_frame.html [erlang.org]
