
48 Core Vega 2 in the Making 206

Posted by ScuttleMonkey
from the packed-in-like-sardines dept.
TobyKY76 writes to tell us The Inquirer is reporting that upstart Azul Systems is planning to integrate 48 cores on their next generation chip. From the article: "The first-generation Vega processor it designed has 24 cores but the firm expects to double that level of integration in systems generally available next year with the Vega 2, built on TSMC's 90nm process and squeezing in 812 million transistors. The progress means that Azul's Compute Appliances will offer up to 768-way symmetric multiprocessing."
This discussion has been archived. No new comments can be posted.

48 Core Vega 2 in the Making

Comments Filter:
  • Oh my God, a 768-way processor!!! I almost had a fit when I saw Sun's T1 processor [sun.com] (http://www.sun.com/smi/Press/sunflash/2006-03/sunflash.20060321.3.xml). I need to sit down and catch my breath; this is a multithreader's wet dream. Hmmm, I wonder how Oracle will price running on this processor?
    • Re:Oh wow!! (Score:3, Funny)

      by AuMatar (183847)
      If you have to ask, you can't afford it.
      • Re:Oh wow!! (Score:2, Interesting)

        by liliafan (454080) *
        lol lol, hmmm: $40,000 per processor, / 2 for virtual processors = $20,000, * 768 = $15,360,000 licence fee per chip. No wonder Larry isn't worried about Oracle's sales stagnating; as soon as people upgrade from their old dual cores to these, he beats Billy Boy on the billionaires list :op
    • Huh? Only 768? Bah!

      I will be designing and selling an Earth Simulator on one chip. Only $50,000,000,000...paid in advance.

    • You're far too trusting. Remember, this is the same "online journal" that printed the Google Office Confirmed [theinquirer.net] scoop right after a press conference that announced nothing of the sort.

      The Inquirer is taking Azul's word for it at the moment, which is probably why the article is so light on details. About a billion questions pop into my mind when I hear a story like this. The only answer I get is, "Sun is banging on Azul's door for IP infringement."

      Sure.

      Does anyone have any real info on these guys? About all
    • If you've ever read Oracle's policy on per-processor licensing:
      http://www.oracle.com/corporate/press/2005_dec/multicoreupdate_dec2005.html [oracle.com]

      You would see that it depends upon your architecture. For example, when running on UltraSPARC you pay for one processor for every 4 cores you have. With AMD/Intel multicore you pay for every 2 cores you have. Either way, they would have to sit down and devise a per-processor license for you. Though you can always purchase per user instead of per processor. This would probably be the
      • Though you can always purchase per user instead of per processor. This would probably be the best route if you're running hundreds or thousands of processors.

        That assumes you don't also have 80K+ users :-)
        -nB
    • Oh my God, 768 way processor!!!

      woo hoo... at last... something that has processors left over after allocating one to every task running on this Kubuntu box... ;)

  • Now I can finally deal with the 768 or so pop ups that come up whenever I attempt to use my computer!!
  • AutoCAD (Score:5, Funny)

    by turtleAJ (910000) on Tuesday March 28, 2006 @04:40PM (#15013101)
    Behold the power of copy/paste!


    Yeah, yeah... my Karma is SUPER negative...
    • by DragonHawk (21256) on Tuesday March 28, 2006 @06:27PM (#15013923) Homepage Journal
      Right now, it seems like they're a company, but they don't have a product yet. I guess that means...

      There is no Vega, only Azul.

      (Thank you, thank you, I'll be here all week.)
    • Um... the correct autocad command would be array.

      Sorry for being one of those lame-asses who corrects others' jokes. It just rankled me a bit to think of a million copy-and-pastes -- array is a nice command if you're a drafter; the polar array option comes in handy more often than you might think.
  • by Raul654 (453029) on Tuesday March 28, 2006 @04:40PM (#15013102) Homepage
    I know of a certain project [wikipedia.org] that's working to put over a million cores into a system (160 into a single chip), and it should be finished and available off-the-shelf within a year or so.
    • The wikipedia link itself says 80 cores per chip not 160.
  • Finally! (Score:5, Funny)

    by lax-goalie (730970) on Tuesday March 28, 2006 @04:42PM (#15013112)
    Enough CPU power that even Microsoft Office will run with a little pep!
  • I read the article, and I was a little confused on that point. I would think that they are definitely not.
    • They are not x86 compatible. They are RISC-like chips with an instruction set optimized for running VM-based applications like Java and .NET.

      That said, it's still very impressive to get that many cores working together, though not as impressive as it would be with x86 cores.
    • They're not x86 compatible. They're special purpose chips for Network Attached Processing devices for driving up utilization on Java applications. For another approach to driving up utilization on server-side Java applications you should check out Cassatt [cassatt.com].

      Cassatt was discussed on Slashdot a few weeks ago [slashdot.org].

      -Steve

  • Not for nothing... (Score:4, Insightful)

    by Red Flayer (890720) on Tuesday March 28, 2006 @04:45PM (#15013125) Journal
    But I'd tend to take a website's articles with a grain of salt when the links at the bottom of the page are:

    "Home Discuss on our Forum Flame Author

    Recommend this article Print"


    Sounds to me like someone issued a press release and wants a share of the excess VC floating around... and the Inquirer took the bait. They did a good job of not loading the buzzwords, though -- they didn't say they would 'leverage their experience with graphics chip design' or anything like that.

    I'd expect this company to turn around and sell out to AMD or Intel at the earliest opportunity, if given the chance.
  • 768 cores. Finally, a box that will run Everquest II.
  • 768 cores, why? (Score:2, Insightful)

    by Coopjust (872796)
    Dual cores, quad cores, whatever, I can understand that for multitasking and programming. But 768 cores? What would possibly use that many cores? And for any single task, the thing would not be efficient. What exactly is the point of this? Bragging rights?
    • "...But 768 cores? What would possibly use that many cores?..."
      If you'd RTFA, you'd note that they're building chips with 48 cores, not 768.

      To answer your question, anyways: More cores == better efficiency == less heat == lower electric bill. Desktop users may not need a 48-core chip (yet), but server farms love designs such as this.

    • Think virtualization ....
    • Re:768 cores, why? (Score:4, Insightful)

      by thrillseeker (518224) on Tuesday March 28, 2006 @05:08PM (#15013293)
      What would possibly use that many cores?

      Any task that can be split into multiple processes. An example is an array of data, where a single algorithm is going to be applied to each element. An array of data can represent anything - an image, or stock prices, or DNA, etc.

      for any single task, the thing would not be efficient.

      It depends on what you really mean by a single task - a given process consists of multiple sequential tasks, where a task may be as fine-grained as a single CPU operation, or perhaps due to overhead of communication between tasks, a tuning effort can be made to say a "task" is some multiple of operations.
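      The data-array case described above can be sketched in Java (the platform these appliances target). This is only an illustration under assumptions: it uses the parallel-streams API, which postdates these chips, and the price array and the 5% markup are made-up stand-ins for "a single algorithm applied to each element":

```java
import java.util.stream.IntStream;

public class ParallelMap {
    public static void main(String[] args) {
        double[] prices = {10.0, 20.0, 30.0, 40.0};
        double[] adjusted = new double[prices.length];
        // Each element's update is independent of every other element's,
        // so the runtime is free to hand slices of the index range to
        // however many cores the machine has.
        IntStream.range(0, prices.length)
                 .parallel()  // fan the per-element work out across cores
                 .forEach(i -> adjusted[i] = prices[i] * 1.05);
        System.out.println(adjusted[3]); // 42.0
    }
}
```

      Because no element depends on its neighbors, the same sketch scales from 2 cores to 768 without any change to the code.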

      • My first thoughts were graphics applications, particularly 3d and rendering apps. This could be especially useful when it comes to things like raytracing for an animation, where you could dedicate parts of the work to different cores. Kinda like having a server farm all in one machine...
    • But 768 cores? What would possibly use that many cores?


      Think "real-time raytracing". Think computer games with real shadows from multiple lights, with curved mirrors, with glass that actually bends light.

      Think about a first-person shooter where you can notice someone sneaking up behind you when you spot his reflection off the hood of the car you're using for cover.
    • "What exactly is the point of this?"

      Simulations, of weather, bio/chemical/nuclear reactions, searching for extra terrestrial signals in the radio spectrum (okay, kidding about that one)... sciency stuff :-p

    • yeah, frankly, 640 cores should be enough for everybody, no?
    • What would possibly use that many cores? And for any single task, the thing would not be efficient. What exactly is the point of this?

      Remember, the human brain has 100 billion cores, each in itself very inefficient, and yet it is pretty powerful.

      There are huge amounts of unexploited parallelism in the tasks our computers perform. The problem is mainly that most of the tools we use (programming languages etc.) are very serial in how they describe and handle problems and solutions. This is natural, of cou

    • DNA work, or other easily parallelized processes. It's for high-end work, not so you can run Word or something.
    • Keep 1280 cores at a 90-100% load, 24 hours a day, 365 days a year. They show no sign of stopping.

      It is called supercomputing.
    • Ever used SETI or UD? Those are running on hundreds of thousands or even millions of processors. This is just faster.
    • What would possibly use that many cores?
      The simple answer is, "anybody who currently has a cluster or server farm, plus those who would like to if only they were cheaper and more power efficient."
    • Many servers (HTTP and databases) use one process per client. It's not uncommon to see 20 copies of Apache running on a web server. I typically see a few copies of PostgreSQL running on my system.

      For desktops, video rendering can use as many CPUs as you can get. Many desktop applications are now written to take advantage of multiple CPUs. Check out Apple's iTunes. I've seen iTunes run a dozen threads.

      Huge numbers of CPUs, as in "thousands", will likely be required for true AI.

    • Many others have posted their proposed uses for massively parallel processing units. Personally I see this as a great server platform for running software that is already optimized for multiprocessor architectures. But you also bring up a good point in that most software is not optimized for multithread CPUs, especially on the desktop.

      A few years ago only expensive servers had multiple processors. Now every major CPU maker produces a multicore chip as their flagship product. Many of these chips have hit
  • by Revellion (803549) on Tuesday March 28, 2006 @04:52PM (#15013171) Homepage
    Wouldn't wanna see how htop looked on that possible SMP setup... I would have to page down at least 80-100 pages of per-processor load meters before I got to the process list :S
    • I expect the process list will have to be kept on a high end external database server so that reasonable performance metrics can be collected and monitored.
  • by Seanasy (21730) on Tuesday March 28, 2006 @04:52PM (#15013173)

    So, chip manufacturers have adopted the Gillette approach to marketing chips. I guess it was inevitable after they went from one core to two. The only difference, I expect, is that they'll increase by powers of 2. Soon, we'll have an Intel Mach 512 Core Sensor Extreme or something :P

  • It would seem to me that a CPU's workload is roughly limited by the number of transistors it has multiplied by its MHz speed. No matter how many cores one has, the transistor count should remain roughly the same for a 1-core, 2-core, or 8-core chip of the same nm process and is limited by that process (90 nm in this case).

    I would suppose (but am not sure) extra cores reduce the number of transistors being idle at any one moment. The downside would seem that each extra core reduces the capability to process
    • It would seem to me that a CPU's workload is roughly limited by the number of transistors it has multiplied by its MHz speed.

      The number of transistors can go up for a variety of reasons. Chief among them is designs that utilize complex performance enhancements. To name a few:

      • Superscalar processing
      • Branch prediction
      • Hyperthreading
      • Out of order instructions
      • Pipelining


      The secondary source of transistor usage is coprocessors like Floating Point Units and SIMD Units.

      The latest craze in processor design is to simplify the microprocessor back down to the most basic level. From there, the processors are ramped up through sheer numbers of parallel pipelines (i.e. threads) and cores, as opposed to ramping up individual CPU horsepower. These multi-core chips typically share coprocessors among pipelines or cores, and may even have entire cores dedicated to specific tasks like SIMD. As a result, a properly designed program will be able to execute within a very short period of time, thanks to the parallel nature of the multi-core architecture.

      Now the only problem is in finding these "properly written programs".
      • properly written programs

        Almost.

        Sooner or later you have to branch, or speculate, or go to main memory (or *gasp* I/O). So exploiting parallelism in arbitrary code is quite a challenge. Having worked on optimizing the linpack benchmark for a small "supercomputer" (64 CPUs) during college, it is quite a bit of work to decompose a problem this way.

        However, the trend is not "well written" parallel apps, but rather parallel process scheduling: like antivirus + network stack + device drivers + MP3 player + exc
    • I don't know much about CPU internals but

      So if you don't know CPU internals, why make these statements:

      It would seem to me that a CPU's workload is roughly limited by the number of transistors it has multiplied by its MHz speed.

      - NO, number of transistors has nothing to do with it.

      No matter how many cores one has, the transistor count should remain roughly the same for a 1-core, 2-core, 8-core chip of the same nm process and is limited by that process (90 nm in this case).

      - NO, transistor c

      • No matter how many cores one has, the transistor count should remain roughly the same for a 1-core, 2-core, 8-core chip of the same nm process and is limited by that process (90 nm in this case).

        - NO, transistor count WILL increase with the increase of the number of cores.

        But it would seem to me, that for the same sized chip, regardless of number of cores, the process (90 nm in this case) limits the number of transistors.

        • But it would seem to me, that for the same sized chip, regardless of number of cores, the process (90 nm in this case) limits the number of transistors.

          Correct, that would set the upper limit if everything was constant. But the article does not state what process was used in the first-generation chip:

          The first-generation Vega processor it designed has 24 cores but the firm expects to double that level of integration in systems generally available next year with the Vega 2, built on TSMC's 90nm proces

        • There is no limit to the number of transistors you can fit on a chip. There is, however, a limit to the number of transistors you can economically fit on a chip. As you increase the transistor count, you dramatically decrease the yield, and drive up the cost. If you want your die size to be more than a 300mm radius circle then you are going to have to build a new fab (several hundred million dollars) and then expect yields of a few percent (and even then, you will have some defects on each chip).

          Beyon

      • It would seem to me, that a CPU's workload is roughly limited by the number of transistors it has multiplied by it's MHz speed.

        - NO, number of transistors has nothing to do with it.

        Your statement is just as wrong as his. Yes, you can design a faster CPU given more transistors - up to a point. (If more transistors don't help, why do you think CPUs have had increasing transistor count for entire history of digital computing?) Only recently have designers failed to find new ways to use more trans

  • A new beginning (Score:3, Interesting)

    by jellomizer (103300) * on Tuesday March 28, 2006 @04:57PM (#15013209)
    Now, in place of the MHz/GHz wars, the new standard will be how many cores your processor has. By 2026: my PC has 2k cores while your PC has only 1.5k cores, thus my PC is superior. It will be just as pointless as comparing PCs using MHz/GHz.
  • How well do these multi-core chips fail? Do they fail silently? Do they come crashing down if even a single core on them fails?

    Are we putting too many eggs in one basket? I thought modular design was good.

    Oh well, back to setting up Linux on old dell boxes. Maybe I will get a real server one day. *grin*
  • extreme (Score:3, Funny)

    by matt328 (916281) on Tuesday March 28, 2006 @05:02PM (#15013253)
    If Intel's two cores are 'Extreme Edition' what should these be called? Ludicrous Edition? With a little sign by them saying 'Never Use'?

    Trying to be humorous, not seriously comparing the two chips.
  • by Tired and Emotional (750842) on Tuesday March 28, 2006 @05:07PM (#15013284)
    So what does the memory interconnect look like on this thing? They say it's not NUMA, but I see no mention of what it is.

    There's no way you can feed that many processors over a single bus, and if you've got symmetric access to a bunch of busses, that's one heck of a crossbar switch, and I don't see that it's any easier to program than NUMA. Instead of making sure data you need fast is local, you have to make sure you load balance - that has to be harder much of the time.
    • Not knowing much about processor design, I'm going to talk out my ass here for a second; bear with me.

      First off, not every core needs to be as powerful as an AMD or Intel core. There are some problems that are easier to solve using a lot of low-power cores versus one or two uber cores.

      You could also reduce the memory bandwidth to each core. You could keep a fairly powerful core, while only feeding each core a limited bandwidth.

      You could also completely change the idea behind how the proc works. Maybe they c
    • I would think that a crossbar switch would be completely transparent to the OS or app programmer. The bus for the original Athlon (IIRC, Alpha's EV6 bus) was a crossbar switch system and I don't remember any changes to the OS, apps, drivers or anything difficult to support it.

      I've never heard of these companies or projects, so until they demonstrate something or show some credible people in the project, I'll file it in the same category as Infinium Labs and Duke Nukem. I know the current well-known chip m
  • "However... (Score:1, Troll)

    by fak3r (917687)
    reports show that Java still runs dog slow, even with the 768-way configuration."

    Ah, the more things change, the more they stay the same!
  • Neat stuff. (Score:4, Informative)

    by SoupIsGood Food (1179) on Tuesday March 28, 2006 @05:15PM (#15013325)
    I'm not as impressed by the silicon as I am by their product... it's a platform-agnostic application accelerator, designed to make Java apps (or any other VM app) optimized for multithreading go like stink. It does for processing power what a storage server does for disk space. Plug it into the network, and go... all it does is run a gajillion threads for the VM living on your general-purpose servers. Each core probably isn't very powerful (although they are 64-bit RISC designs), but if you're in dire need of cramming as many lightweight transactions through as possible, lots and lots of little optimized processors are going to be more help than one or two big, fat general-purpose Opterons.

    It's a very neat concept, and the careful wording ("virtual machine accelerator") indicates that they aren't tied to just Java... Azul's Compute Pool could be something future Parrot-lovers can use to sneak LAMP into places where Java rules all.

    They're using some serious silicon know-how to fuel an innovative and original product... gives me hope we aren't doomed to a Wintel-only world, after all.
    • Their CEO is Steven Dewitt, former CEO of Cobalt Networks, so he's got the "why buy an appliance?" sales pitch down pat.
      • Azul is profiled in the issue of Forbes that's on newsstands now. Unfortunately, I found it rather shallow technically (which isn't surprising for Forbes).
  • So which is better? Vega, or IBM/Toshiba Cell Processor?
  • 768 Threads = 768 Memory Accesses.
  • Wow!... (Score:2, Funny)

    by Illbay (700081)
    ...imagine a Beowulf cluster of...

    Ah, never mind.

  • What I would wonder about is how this would work for heat. From a software perspective dual-core chips appear as multiple processors, as it seems to essentially be having multiple processing units in a single die (correct me if I'm a bit off here, searching for the simplest answer).

    The issue I see with this is:

    Multiple processors generate more heat, and consume more power. Would it not be the same for multiple cores, thus making such a machine a power-chugging space-heater? Are special cooling devices r
    • The practical limit for dissipation by air cooling is around 100-150 Watts. Heat is a function of the number of transistors which are powered on *and* the clock rate. If you want to make a single-core chip run faster, there are really only two options -- 1) put more fancy circuitry (branch prediction, new instructions, etc.) on the chip but only switch it on when necessary and when heat constraints allow you to, and/or 2) ramp up the clock rate. Intel put their money on #2 and they hit the heat wall, hard. But
  • by TheNarrator (200498) on Tuesday March 28, 2006 @08:05PM (#15014598)
    Have any of you people visited Azul's website? This is not an Intel-compatible machine. It is going to run only a Java Virtual Machine, so anything written in Java will run on it. Windows will not run on it. I took some operating system courses in college, and the Intel architecture is a huge mess of backwards-compatibility hideousness that luckily only operating system implementers have to deal with. By only running Java, these guys get to sidestep the whole mess and focus on massively optimizing the hardware architecture for running Java code.

    http://www.azulsystems.com/products/nap.html [azulsystems.com]
  • by Khashishi (775369) on Tuesday March 28, 2006 @08:18PM (#15014656) Journal
    Languages like C are inherently serial. All statements must be placed in sequential order even when there are no serial dependencies between them. E.g.
    x = zeta(y)
    w = gamma(z)
    print(x+w)
    The code explicitly states that x should be calculated before w, although they could certainly be calculated concurrently. Of course a smart compiler could figure out the dependencies, but the programming language shouldn't force the programmer to specify an order when none exists.

    I predict that non-procedural languages will dominate the future of programming. Some currently used languages seem already well-suited for taking advantage of multiple cores, like HDL languages, functional languages, Labview-style languages.
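    As a rough sketch of the point, the two independent calls above can be run concurrently today with library support rather than language support. This uses java.util.concurrent; zeta and gamma here are hypothetical stand-ins (square and increment), since the originals in the example aren't defined:

```java
import java.util.concurrent.CompletableFuture;

public class IndependentCalls {
    // Made-up stand-in bodies for the zeta/gamma calls in the example.
    static int zeta(int y) { return y * y; }
    static int gamma(int z) { return z + 1; }

    public static void main(String[] args) {
        // x and w have no dependency on each other, so both computations
        // are launched at once; nothing forces zeta before gamma.
        CompletableFuture<Integer> x = CompletableFuture.supplyAsync(() -> zeta(3));
        CompletableFuture<Integer> w = CompletableFuture.supplyAsync(() -> gamma(4));
        // join() is the only point that imposes an ordering: the sum
        // cannot be printed until both results exist.
        System.out.println(x.join() + w.join()); // 14
    }
}
```

    The programmer still has to spell the concurrency out by hand, which is exactly the burden the comment argues the language should lift.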

  • Size (Score:3, Interesting)

    by necro81 (917438) on Tuesday March 28, 2006 @09:15PM (#15014941) Journal
    For comparison purposes, the Cell processor in the PS3 has something like 100 million transistors, comes from a 90 nm process, and has a die size of about 1 cm square. The Cell has a modestly-sized cache, which means that its transistors are mostly given over to functional blocks. This is in contrast to something like a P4 Extreme edition, which has a higher transistor density because more than half its die is cache memory.

    TFA does not mention anything about this new processor's die size. But, if we scale up the Cell processor's transistor density, the Vega processor, with 812 million transistors, would result in a die size of about 800 mm^2, which is more than one square inch. In the processor industry, that kind of die size is just plain ridiculous. I wonder what the yields are?

    • TFA does not mention anything about this new processor's die size. But, if we scale up the Cell processor's transistor density, the Vega processor, with 812 million transistors, would result in a die size of about 800 mm^2, which is more than one square inch. In the processor industry, that kind of die size is just plain ridiculous. I wonder what the yields are?

      If those transistors are mostly cache, then the yields are pretty good. If they are logic, then there needs to be more clarification. Firstly the

  • "The desktop is the computer."
