Technology

A Look Into The Cell Architecture 318

ball-lightning writes "This article attempts to decipher the patent filed by the STI group (IBM, Sony, and Toshiba) on their upcoming Cell technology (most notably going to be used in the PS3). If it's as good as this article claims, the Cell chip could eventually take over the PC market."
  • Dataflow squared (Score:5, Interesting)

    by Space cowboy ( 13680 ) * on Saturday January 22, 2005 @11:53PM (#11445725) Journal

    The original PS2 design was for a dataflow architecture - the Cell is a continuation (and significant evolution) of the theme. Interestingly enough, if this *does* take off it may be that the best programmers of tomorrow turn out to be the PS2 low-level guys, who've already written the algorithms that are about to be important.

    In the PS2, the MIPS chip was there mainly to do the simple stuff; all the heavy lifting was done on the two vector processors, which were designed to have programs uploaded into them and data streamed through them using a very flexible (chainable) DMA engine. Sounds similar (if in a limited sense) to the Cell chip itself.
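    Roughly this pattern, in a pseudo-C sketch (the function names are made up for illustration; they're not the real Sony libraries):

        /* PS2-style dataflow: upload the program to the vector unit once,
           then stream batch after batch of data through it via a chained
           DMA transfer. Stub bodies stand in for the hardware interfaces. */
        #include <stddef.h>

        static void vu_upload_program(const void *code, size_t len) { (void)code; (void)len; }
        static void dma_chain_append(const void *src, size_t len)   { (void)src;  (void)len; }
        static void dma_chain_kick_and_wait(void)                   { }

        void transform_vertices(const float *verts, size_t n_batches, size_t batch_len,
                                const void *microcode, size_t code_len)
        {
            vu_upload_program(microcode, code_len);  /* the code stays resident */
            for (size_t i = 0; i < n_batches; i++)   /* only the data moves     */
                dma_chain_append(verts + i * batch_len, batch_len * sizeof(float));
            dma_chain_kick_and_wait();               /* the chain feeds each batch in turn */
        }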

    Simon.
  • by auzy ( 680819 ) on Saturday January 22, 2005 @11:56PM (#11445739)
    It's very rare for a system to be able to be completely parallelised.

    There will always be "critical sections": data that can only be used by one thread at a time, which limits how much a program can be split up. And some programs can't be split up at all. You can easily split a game into sound, video, and input threads, for instance, but really utilising parallel processing takes a massive amount of code, and with current languages that makes a massive increase seem a bit implausible.

    It should also be remembered that the G5s and G4s already have AltiVec, and even though the Cell is on a much grander scale, there will always be bottlenecks that slow it down, preventing 99% of commonly used apps from getting a significantly large increase.
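    To put a rough number on that (textbook Amdahl's law, with figures made up for illustration): if 80% of an app parallelises perfectly across 8 processing units and the remaining 20% stays serial, the best possible speedup is

        1 / (0.2 + 0.8/8) = 1 / 0.3 = ~3.3x

    not 8x, and the serial fraction of most desktop apps is far worse than 20%.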

  • by Rares Marian ( 83629 ) <hshdsgdsgfdsgfdr ... tdkiytdiytdc.org> on Sunday January 23, 2005 @12:07AM (#11445778) Homepage
    A measly 68k CPU with hardware that was autonomous.

    A measly MIPS with hardware that is autonomous.

    The only thing they need is to sync to the TV set.
  • Re:Transmeta (Score:3, Interesting)

    by kai.chan ( 795863 ) on Sunday January 23, 2005 @12:36AM (#11445912)
    There's nothing revolutionary here

    There are a _lot_ of revolutionary ideas behind the Cell processor. As shown in the write-up, the Cell departs drastically from the conventional arithmetic-unit/cache setup. Additionally, the way the Cell can pipeline parallelizable problems amongst the 8 processing units within itself is already a revolution in chip design. Take the video encoding/decoding example shown in the write-up: whereas an Intel chip has to process each procedure in sequence, the Cell can separate the procedures, pipeline the process, and produce results in a fraction of the time an Intel chip takes. Since much of our processing power in home electronics goes into video, audio and 3D visualization (all of which are highly parallelizable), being able to separate tasks onto separate processing units dramatically increases the speed of computation.

    Add the fact that you can also pipeline processes amongst Cells within one piece of electronics, or spread the problem across a multitude of other home electronics, and you have a much different type of processor from the everyday Intel and AMD. The way you "upgrade" the Cell is also revolutionary: buying another piece of electronics increases the processing power of your household.
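    For the sceptics, the arithmetic behind the pipelining claim (idealised, ignoring transfer overheads): pushing N frames through S equal stages costs N x S stage-times on a single sequential unit, but only N + S - 1 stage-times once each stage runs on its own processing unit. With 300 frames and 3 stages that's 900 versus 302; as N grows, throughput approaches S times the sequential rate.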
  • Re:Its a dupe (Score:3, Interesting)

    by Peyna ( 14792 ) on Sunday January 23, 2005 @01:00AM (#11446005) Homepage
    The words duplicate and triplicate actually vary little from their Latin roots, duplicatus and triplicatus: "to double" and "to triple." The word "dupe" isn't officially recognized as a synonym for duplicate, so the argument is pretty much moot.

  • by mcc ( 14761 ) <amcclure@purdue.edu> on Sunday January 23, 2005 @01:27AM (#11446097) Homepage
    I've suspected for a very long time that the XBox was basically just a big blindside at Sony. The XBox loses a huge amount of money, and looks as if it will continue to lose a huge amount of money right into the XBox 2 line; Microsoft must be doing this for some reason. My personal theory for a while has been that at least one of Microsoft's motivations in spending all this money is that they see the Playstation as a potential future threat; i.e., they feared and fear that at some point the Playstation 2 or 3 or 4 will become so close in power and functionality to a PC that it will begin to supplant the PC for common tasks. This would be disastrous for Microsoft: their lockdown on the PC market is complete, but that doesn't protect them from the PC market itself being slowly eaten away from the bottom by consumer electronics like the ones Sony makes.

    So to stave off this threat, Microsoft instead grows the PC market it monopolizes downward, so that the PC (as it becomes the "Windows Media Center") begins to slowly suck up the consumer electronics market, competing directly with the Playstation and bringing the fight to Sony's door instead of Microsoft's. Since consumers wouldn't on their own be interested in a PC that supplants consumer electronics, Microsoft basically bribes them into being interested with subsidized hardware; they make a big money blackhole out of the XBox to undercut Sony's ability to maneuver with the Playstation, the way the money blackhole that was MSIE undercut Netscape's ability to maneuver.

    This is, of course, all just conjecture.

    But when I see people seriously talking about the Playstation 3's chip eventually being used in PC hardware, I begin to wonder if it's maybe reasonable conjecture...
  • 3 architectures (Score:5, Interesting)

    by SunFan ( 845761 ) on Sunday January 23, 2005 @01:34AM (#11446133)

    It's been said before, but mature industries tend towards three of something, such as GM-Ford-Chrysler. For CPUs, it has to be AMD64/ia32e, PowerPC, and SPARC. They're the only ones with any high-volume prospects. SPARC will certainly be in third place, with AMD64/ia32e and PowerPC duking it out for one and two. The fact of the matter is that Itanium won't be a mainstream processor, and PA-RISC, Alpha, and MIPS are all more-or-less EOL.

    For operating systems it will still be Windows, Linux, and UNIX (predominantly Mac OS and Solaris). Okay, that's four, but the other historical major players are all becoming niche legacy platforms.

    For office suites, it'll be MS Office, StarOffice/OpenOffice.org, and iWork. The others are all niche players.

    For browsers it'll be IE, Firefox, and Safari.

    At least this will tend to simplify some things, because there will be fewer non-Microsoft platforms, making them easier to support. This is a good thing, IMO.

  • by Space cowboy ( 13680 ) * on Sunday January 23, 2005 @01:47AM (#11446191) Journal
    Hmm - short answer: Don't know :-)

    I *think* the programming model will be sort-of-like CORBA, with 'messages' being sent from a central despatcher (the G5 probably, though it could be another APU). I think the messages will be self-contained program+data though - they've even called them APUlets. The OS then schedules them to be executed on the first available APU.

    The message is the data, but the code will be bundled along with it, and when it's finished, it'll send another message back to the despatcher (or 'return' some value, depending on how you view these things). In a traditional messaging system, the code is fixed. In this paradigm you get to change the code as well as the data - could be a nightmare to debug, but the flexibility is staggering.
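    A toy model of the idea in plain C (my guess at the shape of an APUlet, nothing official; a function pointer stands in for the bundled APU program):

        #include <stdio.h>

        typedef struct {
            void (*code)(float *data, int n);  /* the bundled program  */
            float *data;                       /* the bundled operands */
            int    n;
        } apulet;

        static void scale_by_two(float *d, int n) { for (int i = 0; i < n; i++) d[i] *= 2.0f; }

        /* A real despatcher would hand each apulet to the first idle APU;
           here they simply run in turn. */
        static void despatch(apulet *q, int count)
        {
            for (int i = 0; i < count; i++)
                q[i].code(q[i].data, q[i].n);
        }

        int main(void)
        {
            float a[4] = {1, 2, 3, 4};
            apulet job = { scale_by_two, a, 4 };
            despatch(&job, 1);
            printf("%g %g %g %g\n", a[0], a[1], a[2], a[3]);  /* 2 4 6 8 */
            return 0;
        }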

    So, yes, I think messaging systems will be the way this pans out. I wonder if Apple R&D are at this moment chained to a Cell, porting Darwin...

    Simon
  • by Anonymous Coward on Sunday January 23, 2005 @01:52AM (#11446207)
    IBM -> PowerPC
    Apple -> PowerPC

    Cell -> PowerPC
    IBM, Sony -> Cell

    IBM, Apple -> Linux, BSD (Unix)

    Doesn't take a genius to come up with:
    IBM->Cell->Apple+Sony

    Sony makes the best computers; Sony makes one of the best gaming consoles.

    Although I'd rather see Apple join forces with Nintendo since these two companies are more alike than any other (quality over quantity).
  • Re:3 architectures (Score:4, Interesting)

    by SunFan ( 845761 ) on Sunday January 23, 2005 @01:53AM (#11446210)

    I don't think it does. Microsoft will be around for a while, unfortunately. In my sig, I expect Solaris, Mac OS, and Linux to be the top three of the UNIX side (not necessarily in that order). The BSDs are there for completeness, as they are good systems but are niche players. The main point behind my sig is that all the options listed are either cheaper/freer than Microsoft's options or just flat out better than Microsoft's options (or both). Microsoft really is in a precarious situation, where they have only inertia carrying them at the moment (granted, it's a lot of inertia but it's definitely finite).
  • Re:x86 (Score:5, Interesting)

    by goMac2500 ( 741295 ) on Sunday January 23, 2005 @03:14AM (#11446479)
    You realize you're talking about the company that had to cancel its 4.0 GHz P4, and is scrambling just to get to 64 bit. How are they supposed to be developing a competitor to the Cell when they're behind everywhere else? And guess what? IBM is co-creating the Cell, and where are they going to use it? Workstations, isn't it? Doesn't that mean... computers? Now why would they design the processor to run well only in games when they are going to use it in workstations? Not only that, but the Pentium 4 runs hot as hell. How do you suggest you're going to get 4 Pentium 4 cores in one chip, and then throw 4 of those in a machine, without having major heat issues? I don't need to know what Intel is doing in their research department because they're already so far behind the game. Get back to me when Intel has a cool-running 64 bit chip they can at least START WITH. AMD is in a much better position to go against Cell than they are. There is a reason why Intel is out of the next-gen game systems.
  • Re:x86 (Score:5, Interesting)

    by Screaming Lunatic ( 526975 ) on Sunday January 23, 2005 @03:33AM (#11446535) Homepage
    Only if it complies with x86. Seriously, x86 will be around for a century.

    That's ridiculous. x86 is dead. The overheating and power consumption confirm it.

    CISC hardware is horrible in mobile devices because of battery life and power consumption. Your camera, iPod, cell phone, and PDA do not use x86 hardware.

    All next generation consoles will use RISC hardware. Hence, economies of scale to get the price down.

    x86 is dead and mobile devices wrote the eulogy.

  • by (outer-limits) ( 309835 ) on Sunday January 23, 2005 @04:30AM (#11446641)
    The Transputer is what this seems to resemble, to me: http://vl.fmnet.info/transputer/
  • Re:Transmeta (Score:3, Interesting)

    by rossifer ( 581396 ) on Sunday January 23, 2005 @04:31AM (#11446647) Journal
    Interestingly (to me), the Pentium-M looks well on its way to squashing the Pentium-4 in the market.

    Pentium-4 was an architectural mistake, conceived with the goal of pushing the MHz numbers up (since the mass market appeared to trust MHz over "MHz-equivalent" labels). AMD astonished them by finally making their alternative naming scheme credible, and the plan behind the P4 went straight down the crapper.

    New x86 development at Intel is largely derivative of the P3 core (the family that includes the P-M) and has effectively deprecated the overheating, underperforming P4 core.

    Regards,
    Ross
  • by JQuick ( 411434 ) on Sunday January 23, 2005 @04:46AM (#11446669)
    The author had a good grasp of the high level architecture, but beyond that was clueless. His interpretation of the design is way off the mark.

    He seemed astonished by the 1024-bit-wide data paths, but 1024 bits is just 128 bytes, and the POWER family is already designed with 128-byte cache fill lines. The G5's L2 cache, for instance, already fetches 128 bytes for each main-memory read.

    Similarly, all the talk about doing away with cache and VM is bullshit. Instead of having each vector unit interfere with a shared cache as is done today, they've simply added smaller per-ALU caches to the design, and complemented them with a device that is a souped-up cache controller/MMU (the DMAC). The DMAC apparently will be able to address both memory and other hardware through a virtual address layer, enabling references to remote Cell units as well as to local physical hardware. The 64 MB of high-speed Rambus memory may be all that is required for a PS3, but in a workstation implementation that memory is L3 cache.

    AltiVec currently has 32 vector registers; each ALU has 128. It is highly likely that the core opcode architecture will remain similar. The most likely addition will be a few flow-control instructions added to the existing mix.

    AltiVec is already powerful, but its biggest limiting factor is latency. AltiVec can perform one instruction per clock on the G5, but the pipeline is 8 levels deep, so the overhead involved in fetching data, loading registers, performing a calculation among 1-3 registers, and getting a result back is prohibitively expensive. If you can arrange to submit 8 or more calculations in rapid sequence, however, you can keep AltiVec and the CPU busy and reap great benefits.
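    Concretely, the standard trick for keeping that 8-deep pipeline full is independent accumulators. A minimal sketch (the intrinsics are real AltiVec, compiled with gcc -maltivec on a G4/G5; the function itself is my own illustration and assumes n is a multiple of 4):

        #include <altivec.h>

        vector float sum_products(const vector float *a, const vector float *b, int n)
        {
            vector float acc0 = (vector float){0, 0, 0, 0};  /* GCC vector literal */
            vector float acc1 = acc0, acc2 = acc0, acc3 = acc0;
            for (int i = 0; i < n; i += 4) {
                acc0 = vec_madd(a[i],     b[i],     acc0);  /* four independent  */
                acc1 = vec_madd(a[i + 1], b[i + 1], acc1);  /* multiply-adds     */
                acc2 = vec_madd(a[i + 2], b[i + 2], acc2);  /* overlap in the    */
                acc3 = vec_madd(a[i + 3], b[i + 3], acc3);  /* pipeline          */
            }
            return vec_add(vec_add(acc0, acc1), vec_add(acc2, acc3));
        }

    With a single accumulator, each vec_madd would wait on the previous result, and you'd see roughly one result every 8 clocks instead of one per clock.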

    The beauty of Cell will be in providing the ALUs with a bit more autonomy (though not much more; they are still basically vector units), and enabling the main CPU to keep doing useful work while a number of ALUs are cranking away. Other novel design features provide for communication and synchronization with other units via remote addressing and timing (that's what those realtime clock signals are all about).

    This will be very fast, and very cheap. However, all the hand-waving and theorizing this guy does about both hardware and software reads like patent bullshit.
  • Cell IS POWER (Score:4, Interesting)

    by enkidu ( 13673 ) on Sunday January 23, 2005 @05:40AM (#11446785) Homepage Journal
    And if they[IBM] use this[Cell] in servers, they'd kill their POWER line.
    Did you read the article? The Cell architecture is what might have evolved if the multi-core POWER architecture had continued for a couple of generations; Cell just skips those intermediate generations. Here's what the article says: "The Cell architecture is essentially a general purpose PowerPC CPU with a set of 8 very high performance vector processors and a fast memory and I/O system, this is coupled with a very clever task distribution system which allows ad-hoc clusters to be set up." Doesn't sound like IBM is afraid that Cell will kill their POWER line.
  • Re:x86 (Score:1, Interesting)

    by Aragorn992 ( 740050 ) on Sunday January 23, 2005 @05:53AM (#11446816)
    There are so many things wrong with your argument that I won't comment on them all except one:

    "AMD is in a much better position to go against Cell than them. There is a reason why Intel is out of the next gen game".

    Do you have ANY idea how much resources Intel has? Not just money either, but production capacity as well.

    AMD is an annoying insect to Intel, albeit in recent times one with a bit of a sting.

    BTW, I'm an AMD user and supporter.
  • by TheRaven64 ( 641858 ) on Sunday January 23, 2005 @06:24AM (#11446874) Journal
    Firstly, the apulets are unlikely to be both code and data. The APUs are vector processors designed to process streams of data, making it far more likely that you upload the code and the stream data to it (either directly or from another APU).

    Secondly, Darwin will not need porting to the Cell. It will almost certainly run with no modification on the PU. Things like QuickTime, Quartz and CoreVideo/Audio are likely to benefit by having components run on an APU, as might things like the network stack, but this is likely to be done over time rather than all for the initial release.

  • Re:x86 (Score:4, Interesting)

    by fish waffle ( 179067 ) on Sunday January 23, 2005 @08:46AM (#11447198)
    AMD is there with 64-bit, but what does that really buy you? More memory address space? How many people at home really want over 4 GB of RAM?

    I've seen this naive opinion just too often to let another utterance of it escape unchallenged.

    64-bit does indeed offer more address space, which is an advantage to those needing more now or soon. But it has more important advantages: with a large, mostly empty address space you can encode permissions, types, and other info in pointers. You can pack or aggregate instructions and data. You can more easily share an address space with everyone getting a large portion, or support novel, faster memory layouts by dividing the space into areas with different access permissions. Under 32-bit constraints such techniques are less generally useful or excessively constrained, but in 64 bits (and above) they could become much more effective. Think of the ways people are proposing to use IPv6 addresses (though there are a few more orders of magnitude of difference there) versus the ways people currently use IPv4: an increase in address space can be used for more than just more addresses.

    It may require some imagination to exploit it well, but it could have a much larger impact than you (and many others) think.
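    A sketch of the simplest version of the trick, tagging pointers in C (illustration only; it assumes user-space addresses fit in the low 48 bits, as on today's AMD64 parts, and that assumption is precisely the kind of thing that can break later):

        #include <stdint.h>
        #include <stdio.h>

        #define TAG_SHIFT 48
        #define ADDR_MASK ((1ULL << TAG_SHIFT) - 1)

        /* Smuggle a 16-bit type tag into the unused top bits of a pointer. */
        static void *tag_ptr(void *p, uint16_t tag)
        {
            return (void *)(uintptr_t)(((uintptr_t)p & ADDR_MASK) | ((uint64_t)tag << TAG_SHIFT));
        }

        static uint16_t ptr_tag(void *p)  { return (uint16_t)((uintptr_t)p >> TAG_SHIFT); }
        static void    *ptr_addr(void *p) { return (void *)((uintptr_t)p & ADDR_MASK); }  /* strip before use */

        int main(void)
        {
            int x = 42;
            void *tagged = tag_ptr(&x, 7);
            printf("tag=%u value=%d\n", ptr_tag(tagged), *(int *)ptr_addr(tagged));
            return 0;
        }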
  • Re:Dataflow squared (Score:3, Interesting)

    by Syre ( 234917 ) on Sunday January 23, 2005 @09:37AM (#11447304)
    Here's an article [eetimes.com] that goes into some detail on the Cell architecture and why it may not actually be as fast in practice as in the glowing predictions made by Sony executives.

    The essential quote:
    UNC's Zimmons has his doubts. "I believe that while theoretically having a large number of transistors enables teraflops-class performance, the PS3 [Playstation 3] will not be able to deliver this kind of power to the consumer," he wrote in response to an e-mail query from EE Times. "The PS3 memory is rumored to be able to transfer around 100 Gbytes/second, which would mean it could process new data at roughly 25 Gflops (at 32 bits) -- far from the 1-Tflops number."
    I hope for great things, but will believe them when I see 'em.
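    For what it's worth, the arithmetic behind Zimmons' estimate is just bandwidth divided by operand size, assuming every 32-bit operand has to be fetched fresh from main memory:

        100 GB/s / 4 bytes per operand = 25 billion operands/s = ~25 Gflops

    The teraflop figure can only apply to code that reuses data already sitting in the local memories.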
  • by tempshill ( 413165 ) on Sunday January 23, 2005 @12:02PM (#11447908)
    If you pack more data into the pointers, you'll have applications that break in a few years when that extra address space is needed. Ask Apple what happened when they moved from the 68000 (32-bit addresses of which only 24 bits were used) to the Mac II's 68020 (full 32-bit addressing). Four Macs (the II, IIx, IIcx, and SE/30) actually had versions of QuickDraw in ROM using the top byte to pack extra data into pointers, as you recommend, and Apple then had to patch the entire QuickDraw package in RAM to code around this. Untold numbers of apps also broke, of course.
  • by Glock27 ( 446276 ) on Sunday January 23, 2005 @12:49PM (#11448185)
    This article [semireporter.com] has some interesting and somewhat current information.

    Looks like pilot production should begin soon on a 90 nm process similar to that used for current Athlon 64s and Opterons. No word in this article on initial clock speeds or power dissipation.

    Anyone have additional info?

    BTW, another article I hadn't seen linked [com.com] claims that Cell will be relatively easy to program...seems that Sony learned from some of its PS2 mistakes. That contradicts a lot of the threads responding to the original article and this dupe.

  • by Animats ( 122034 ) on Sunday January 23, 2005 @01:59PM (#11448580) Homepage
    That's exactly right. Despite all the hype, this is basically a new generation of the PS2 architecture. There's a conventional CPU and a number of dataflow vector units. The dataflow units have a small amount of fast local memory and access to main memory. Just like the PS2. This time around, everything is bigger and better, and there's more of everything, but it's the same idea.

    The PS2 was revolutionary in that it was the first successful non-von Neumann machine. There have been many exotic architectures along these lines, from the Illiac III to the Transputer to the nCube to the Connection Machine, but they've all been failures in the marketplace. The PS2 sold in volume and made money. That was enough to get people to develop techniques for programming dataflow machines, which aren't fun to program. Working out those problems delayed games for the PS2 by a year or two, but now it's been figured out.

    Now that the techniques have been worked out, at least within the game development community, a new generation of the same approach makes sense. Especially for graphics, which parallelizes well. You can keep throwing hardware at graphics until you get to one processor per pixel per triangle, and still get performance improvements.

    Note the limitations. Each vector processor has only 128K (not MB) of local memory. This is like DSP programming; you don't have much local storage. There's access to main memory, but it will stall the vector processor, so you can't overdo it. Bashing your problem into chunks that fit that constraint is a major hassle.
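    The standard way of bashing a streamed problem into that shape is double buffering: process chunk i while chunk i+1 streams in, so the vector unit never stalls on main memory. A sketch (dma_get/dma_wait are made-up names, stubbed with a plain memcpy so it compiles; a real implementation would issue asynchronous DMA, and the write-back of results is omitted):

        #include <string.h>

        #define CHUNK 8192  /* floats per chunk; two buffers use 64K of the 128K store */

        static float local_store[2][CHUNK];  /* the double buffer in "local memory" */

        static void dma_get(float *dst, const float *src, int n) { memcpy(dst, src, n * sizeof(float)); }
        static void dma_wait(void) { /* would block until the async transfer completes */ }

        static void process(float *buf, int n) { for (int i = 0; i < n; i++) buf[i] *= 0.5f; }

        void process_stream(const float *main_mem, int n_chunks)
        {
            int cur = 0;
            dma_get(local_store[cur], main_mem, CHUNK);  /* prefetch chunk 0 */
            for (int i = 0; i < n_chunks; i++) {
                dma_wait();                              /* chunk i has arrived      */
                if (i + 1 < n_chunks)                    /* start fetching chunk i+1 */
                    dma_get(local_store[cur ^ 1], main_mem + (size_t)(i + 1) * CHUNK, CHUNK);
                process(local_store[cur], CHUNK);        /* overlaps the next fetch  */
                cur ^= 1;
            }
        }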

  • Re:Dataflow squared (Score:1, Interesting)

    by Anonymous Coward on Sunday January 23, 2005 @02:23PM (#11448716)
    Then again, already in the PS2 the goal (in creating pretty game worlds) was to *generate* scenery through heavy procedural computation, not read it off a texture in video memory. It actually sucks at texturing, and not by accident.

    The (memory) bottleneck you point out only really matters in streamed data. But I'd bet real money that the PS3 design team hopes to see game engines that "emerge world" by massive iterative calculations on rather compressed primitives of data (NURBS-like "infinitely accurate" geometry, shader-like procedures mixed and mangled, some kind of low-level matter descriptions used for visuals and physics, et cetera). And games do heavy math on AI, physics, 3D sound generation -- these aren't necessarily bandwidth intensive at all.

    Computationally games are (or can be) very unlike video editing or other bandwidth heavy stuff. And Cell/PS3 is all about vector calculations, really. The bandwidth they tout is mostly just an inevitable side effect, not an outstanding strong point.

    But as somebody pointed out already, whether the PS3 is going to be a giant PITA to program for will depend on the tools that Sony (or IBM) offer for it. Then again, with the multi-year lifespan of a console, it doesn't really hurt if some of the power is harnessed even a coupla years after the first wave of game titles (at launch).
