
A Look Into The Cell Architecture
ball-lightning writes "This article attempts to decipher the patent filed by the STI group (IBM, Sony, and Toshiba) on their upcoming Cell technology (most notably going to be used in the PS3). If it's as good as this article claims, the Cell chip could eventually take over the PC market."
Dupe! (Score:4, Insightful)
Timothy do you actually read Slashdot?
Dupe!-Was it as good for you? (Score:5, Insightful)
Here's a better question. If he will not, why should we?
Re:Dupe! (Score:2)
Re:Dupe! (Score:5, Insightful)
Wouldn't that be like eating from the toilet?
Re:Dupe! (Score:2)
Mr. Pot, meet Mr. Kettle. Yes, Mr. Pot, he is rather dark in hue isn't he?
Re:Dupe! (Score:3, Insightful)
he's buying the sony propaganda at full throttle, probably wasn't around a couple of years ago when they did the EXACT same thing with ps2 - overhyping it to the max.
it's not some revolution chip that will give you a desktop with 4x the power for cheapo cheap..
Re:Dupe! (Score:2)
A Look Into The Dupe Architecture (Score:2, Funny)
x86 (Score:4, Insightful)
Re:x86 (Score:2, Insightful)
Treacherous Computing (Score:2, Insightful)
If, for example, HDTV set-top boxes supported email, Word, and spreadsheets, it'd happen pretty quickly.
I'm not buying a console-style computer until it supports GCC out of the box. I want the freedom to compile my own software for a given machine and distribute it without having to go through a console maker that refuses to even talk to individual developers and smaller firms.
Re:Treacherous Computing (Score:2)
Re:x86 (Score:2)
No kidding (Score:2)
No kidding. That's exactly what I was thinking. x86 has been completely supplanted in the console market, with no share whatsoever. This encourages game developers to the PowerPC platform.
PS3 is Cell (PowerPC based), XBox is PowerPC 970 b
Re:No kidding (Score:2)
1) Game developers would already be using PowerPC systems to develop their games for all three consoles
2) The Macs would be far more powerful than their x86 counterparts, allowing for much higher-powered software and games
3) Linux runs very well on powerpc...
Re:No kidding (Score:2)
Re:No kidding (Score:2)
Since when ?
Windows for Alpha & PPC are defunct
Where is 64bit Windows ?
The only major technological changes I can remember the Windows line dealing with are the switches from 8086 real mode to 286 standard mode and then 32-bit 386 enhanced mode.
Re:x86 (Score:5, Interesting)
Cell IS POWER (Score:4, Interesting)
Re:x86 (Score:4, Interesting)
I've seen this naive opinion just too often to let another utterance of it escape unchallenged.
64-bit does indeed offer more address space, which is an advantage to those needing more now/soon. But it has more important advantages; with a large, empty address space you can encode permissions, types and other info in pointers. You can pack or aggregate instructions/data. You can more easily/directly share an address space with everyone getting a large portion, or support novel/faster memory layouts by dividing the space into areas with different access permissions in the context of reasonable memory access strides. 32-bit constraints on such techniques make them less generally useful or excessively constrained, but in 64-bit (and above) they could become much more effective. Think of the ways people are proposing to use ipv6 addresses [though there are a few more orders of magnitude difference there] versus the ways people currently use ipv4---an increase in address space can be used for more than just more addresses.
It may require some imagination to exploit it well, but it could have a much larger impact than you (and many others) think.
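For anyone who wants the pointer-tagging idea spelled out, here is a minimal sketch in Python, with plain integers standing in for 64-bit pointers. The 48-bit address width, the 16-bit tag layout, and the names `pack`, `unpack`, `can_write`, `PERM_READ` and `PERM_WRITE` are all assumptions made up for illustration, not something any particular CPU or OS mandates.

```python
ADDR_BITS = 48                      # assumed: only the low 48 bits carry the address
ADDR_MASK = (1 << ADDR_BITS) - 1
PERM_READ, PERM_WRITE = 0x1, 0x2    # hypothetical permission flags kept in the tag


def pack(addr, tag):
    """Store a 16-bit tag in the otherwise unused upper bits of a 64-bit pointer."""
    if addr & ~ADDR_MASK:
        raise ValueError("address does not fit in 48 bits")
    if not 0 <= tag < (1 << 16):
        raise ValueError("tag does not fit in 16 bits")
    return (tag << ADDR_BITS) | addr


def unpack(ptr):
    """Split a tagged pointer back into (address, tag)."""
    return ptr & ADDR_MASK, ptr >> ADDR_BITS


def can_write(ptr):
    """Check the write permission without consulting any separate table."""
    return bool(unpack(ptr)[1] & PERM_WRITE)
```

The same trick in a 32-bit space would eat into addresses that programs actually need, which is the constraint the post is pointing at.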
Re:x86 (Score:2)
64-bit does indeed offer more address space, which is an advantage to those needing more now/soon. But it has more important advantages; ... You can more easily/directly share an address space with everyone getting a large portion.
Indeed. Do you know why when you start a program you have to wait several (or maybe even 10 or 15) seconds before it is ready for you
You shouldn't do most of this. (Score:3, Interesting)
Re:x86 (Score:3, Funny)
Hell, if Intel's processors get any warmer, I'm going to get the gas cut off and let the computer warm the house.
We need to advance to 64, or 128 bit technology to be able to keep up with other technologies. Cell seems like a logical next step after reading this post a few
Re:x86 (Score:5, Interesting)
That's ridiculous. x86 is dead. The overheating and power consumption confirms it.
CISC hardware is horrible in mobile devices because of battery life and power consumption. Your camera, iPod, cell phone, and PDA do not use x86 hardware.
All next generation consoles will use RISC hardware. Hence, economies of scale to get the price down.
x86 is dead and mobile devices wrote the eulogy.
Re:x86 (Score:3, Funny)
Until my mobile devices can play Wing Commander, you're full of shit.
Re:x86 (Score:2)
No, that would be C86. X is 10.
Re:x86 (Score:2)
Re:x86 (Score:2)
Its a dupe (Score:3, Insightful)
Maybe slashcode should have a link repository, if someone adds a new story with a link, they get a warning another story pointing to the same link was posted 18 hours ago...
We've even seen triple-dupes.
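A link repository like that is not much code. Below is a rough sketch in Python, assuming stories arrive with a title, a URL and a submission time in hours; `LinkRepository`, `normalize` and `submit` are hypothetical names, and nothing here reflects how slashcode actually stores stories.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce trivially different spellings of a URL to one lookup key."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment; keep the query, since it can select a different page.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


class LinkRepository:
    """Remembers which story first used each link."""

    def __init__(self):
        self._seen = {}  # normalized URL -> (story title, submission hour)

    def submit(self, title, url, hour):
        """Return a warning string if the link was already posted, else None."""
        key = normalize(url)
        if key in self._seen:
            first_title, first_hour = self._seen[key]
            return f"warning: '{first_title}' linked here {hour - first_hour} hours ago"
        self._seen[key] = (title, hour)
        return None
```

This only catches the same link posted twice. Two stories pointing at different mirrors of one article would still get through.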
Re:Its a dupe (Score:2)
Re:Its a dupe (Score:2)
--jeff++
Re:Its a dupe (Score:3, Funny)
Re:Its a dupe (Score:2)
Re:Its a dupe (Score:2)
Are you serious? I have been reading slashdot since 1997 or some time around there, and I can tell you that any good suggestions ever made, such as the one in your post, will *never* get implemented.
Slashdot started with a very very good seed of an idea about a quasi-community news amalgamation site - but since its inception has proved, beyond a shadow of a doubt, how lazy the founders actually are.
They have had opportunity upon opportunity to build upon this site
Re:Its a dupe (Score:2)
Who said anything about pistols at 10 paces? =)
Re:Its a dupe (Score:3, Interesting)
Looks like we need to throw all computers out (Score:2, Insightful)
Re:Looks like we need to throw all computers out (Score:5, Interesting)
He seemed astonished by the 1024 bit wide data paths. The Power family is designed with cache fill lines of 128 bytes, which is the same 1024 bits. So, for instance, the G5 L2 cache already fetches 128 bytes into cache for each main memory read.
Similarly, all the talk about doing away with cache and VM is bullshit. Instead of having each vector unit interfere with a shared cache as is done today, they've simply added smaller per-ALU caches to the design, and complemented them with a device that is a souped-up cache controller/MMU unit (the DMAC). The DMAC apparently will be able to address both memory and other hardware by having a virtual address layer, to enable reference to remote Cell units as well as local physical hardware. The 64 MB of high-speed Rambus memory may be all that is required for a PS3, but in a workstation implementation that memory is L3 cache.
Altivec currently has 32 vector registers. Each ALU has 128. It is highly likely that the core opcode architecture will remain similar. The most likely addition will be a few flow control instructions on top of the existing mix.
Altivec is already powerful, but the biggest limiting factor is latency. Altivec can perform 1 instruction per clock on the G5. However, the pipeline is 8 levels deep, so the overhead involved in fetching data, loading registers, performing a calculation among 1-3 registers, and getting a result is prohibitively expensive. However, if you can arrange to submit 8 calculations (or more) in rapid sequence, you can keep Altivec and the CPU busy and reap great benefits.
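The latency point is easier to see with numbers. This is a back-of-envelope sketch in Python, assuming an idealized pipeline that issues one instruction per clock with no stalls; the depth of 8 is the figure quoted above, and the function names are made up for illustration.

```python
PIPELINE_DEPTH = 8  # stages, the figure quoted in the post


def cycles_dependent(n_ops, depth=PIPELINE_DEPTH):
    """Each op waits for the previous result, so every op pays the full latency."""
    return n_ops * depth


def cycles_independent(n_ops, depth=PIPELINE_DEPTH):
    """Independent ops issue one per clock; only the first pays the fill latency."""
    return depth + n_ops - 1 if n_ops else 0


for n in (1, 8, 64):
    dep, ind = cycles_dependent(n), cycles_independent(n)
    print(f"{n:3d} ops: dependent {dep:4d} cycles, independent {ind:3d} cycles")
```

Under those assumptions 8 back-to-back independent calculations finish in 15 cycles instead of 64, and the gap keeps widening as the batch grows.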
The beauty of Cell will be in providing the ALUs with a bit more autonomy (though not much more, they are still basically vector units), and enabling the main CPU to keep doing useful work while a number of ALUs are cranking away. Other novel design features provide for communication and synchronization with other units via remote addressing and timing (that's what those realtime clock signals are all about).
This will be very fast, and very cheap. However, all the hand waving, and theorizing this guy does about both hardware and software reads like patent bullshit.
This is what happens... (Score:2, Redundant)
As someone posted above, it seems like it would be fairly trivial to at least make a "dupe check" program that tells you whether you have linked to the same URL before...
Re: (Score:2)
Re:This is what happens... (Score:2)
Dataflow squared (Score:5, Interesting)
The original PS2 design was for a dataflow architecture - the Cell is a continuation (and significant evolution) of the theme. Interestingly enough, if this *does* take off it may be that the best programmers of tomorrow turn out to be the PS2 low-level guys, who've already written the algorithms that are about to be important.
In the PS2, the MIPS chip was there mainly to do the simple stuff, all the heavy lifting was done on the 2 vector processors, and they were designed to have programs uploaded into them and data streamed through them using a very flexible (chainable) DMA engine. Sounds similar (if in a limited sense) to the Cell chip itself.
Simon.
They reinvented The Amiga! (Score:5, Interesting)
A measly MIPS with hardware that is autonomous.
The only thing they need is to sync to the TV set.
Re:They reinvented The Amiga! (Score:3, Insightful)
Re:Dataflow squared (Score:3, Interesting)
The essential quote:
Yes, it's basically an improved PS2 (Score:3, Interesting)
The PS2 was revolutionary, in that it was the first successful non von Neumann machine. There have been many exotic architec
Transmeta (Score:4, Insightful)
This is a distributed-processing-capable chip. They're moving software into the chip, doing what software can do in a more compact and probably more efficient way. There's nothing revolutionary here and besides being a dupe story it's way overrated. The only attraction here is the fact PS3 will use it instead of embedding something open, like Mosix.
And no it won't "eventually take over the PC market."
Re:Transmeta (Score:3, Interesting)
There are a _lot_ of revolutionary ideas behind the Cell processor. As shown in the write-up, the Cell takes a drastic change from the conventional arithmetic-unit/cache setup. Additionally, the way the Cell can pipeline parallelizable problems amongst the 8 processing units within itself is a revolution of chip design already. Take, for example, the video encoding/decoding example shown in the write-up, whereas an Intel chip will require processing of each procedure
Re:Transmeta (Score:2, Insightful)
The only attraction here is the fact PS3 will use it instead of embedding something open, like Mosix
I'm not sure if you're praising or knocking Mosix (or more accurately, OpenMosix), but the method by which OpenMosix migrates processes bears very little resemblance to Cell. OpenMosix's redeeming quality is binary compatibility with most, if not all, existing software written for whatever architecture the cluster is running on. Cell resembles MPI more than Mosix, by far, in that software will have to be
Re:Transmeta (Score:3, Insightful)
Re:Transmeta (Score:3, Interesting)
Pentium-4 was an architectural mistake conceived with the goal of pushing the MHz numbers up (since the mass market appeared to trust MHz over "MHz-equivalent" labels). AMD astonished them by finally making their alternate naming scheme credible and the plan behind the P4 went straight down the crapper.
New x86 development at Intel is largely derivative of the P3 core (the family that includes the P-M) and has la
Re:Transmeta (Score:2)
Re:Transmeta (Score:2)
Within a flexible timeframe, there is an inflexible truth -- there will be no popular analog TV broadcasts in the US. This cannot happen without the technologies in place for digital to replace analog TV. The market is, what, a couple of hundred million sets in the US, and a billion or so world-wide, eventually?
Do you think the
There are always critical sections (Score:5, Interesting)
There will always be "critical sections", data which can only be used by 1 thread at a time, which limits how much it can be split up. Then you have programs which can't be split at all. I mean, you can split up a game, for instance, into sound, video, and keyboard threads easily. But to really utilise parallel processing takes a massive amount of code, which, with current languages, makes it a bit implausible to get a massive increase.
It should also be remembered that the G5's and G4's already have altivec, and even though this is on a much grander scale, there will always be bottlenecks that slow it down preventing 99% of commonly used apps from getting a significantly large increase..
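The ceiling described above is usually written down as Amdahl's law: if a fraction p of the run time can be spread over n units, the overall speedup is 1 / ((1 - p) + p / n). A small sketch in Python; the function name is made up for illustration, and 8 units is just the APU count being quoted for one Cell.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Overall speedup when only `parallel_fraction` of the run time scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial = 1.0 - parallel_fraction          # the critical sections and the like
    return 1.0 / (serial + parallel_fraction / n_units)
```

With half the work parallel, 8 units give a bit under a 1.8x speedup. Even at 95% parallel the figure is about 5.9x rather than 8x, which is the bottleneck argument in one number.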
Consider a different approach (Score:5, Informative)
All the programs that run on PC architectures expect certain things to be in place - they expect a single fast central CPU. They expect that good cache usage is important for performance. They expect to have access to gobs of RAM. Etc. Etc. The PS2 (and by extension the cell) is completely different.
Consider a different architecture. You have a job that consists of multiple things to do. Some of these can be easily parallelised, others are mainly sequential. Divide it up so the parallel ones are coded separately, maybe with some IPC to synchronise to some clock.
For a sequential part (say rendering the object list of a scene back to front to gain occlusion) the approach that worked for me on the PS2 (which is logically similar, if significantly less powerful) was to divide the job into tasks. Each task (say, one per object in the above) gets its own bit of code and knows about the data that it needs to perform its task.
The key thing is that the Harvard separation of code and data just isn't, on a PS2. You set up a DMA chain that loads the program into the processor, then streams the data through the program on the processor, lather, rinse, repeat. Make the chain self-submitting and you can effectively forget about that chunk of code now, it'll just happen.
This is still doing things sequentially (but we've agreed that this is a sequential task, right?) - the point is that it's being done highly efficiently within the architectural constraints. You have a dataflow architecture and even sequential code can hit the performance limits if you code to the architecture.
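Roughly, the shape of that model looks like the sketch below. It is Python, so it only mimics the control flow: a list of (program, data) pairs stands in for the DMA chain, plain functions stand in for code uploaded to a vector unit, and the names `run_chain` and `ChainEntry` are hypothetical, not anything from the PS2 toolchain.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Program = Callable[[int], int]               # stand-in for code uploaded to a vector unit
ChainEntry = Tuple[Program, Iterable[int]]   # one task: its code plus the data it needs


def run_chain(chain: List[ChainEntry]) -> Iterator[int]:
    """Walk the chain in order: load each program, then stream its data through it."""
    for program, data in chain:
        loaded = program             # "upload" the code to the processor
        for item in data:            # stream the task's data through the loaded code
            yield loaded(item)


# Two hypothetical per-object tasks, submitted back to front.
chain = [
    (lambda v: v * 2, [1, 2, 3]),    # task for the far object
    (lambda v: v + 10, [4, 5]),      # task for the near object
]
results = list(run_chain(chain))
```

The work is still sequential, but the caller only builds the chain once and never touches the individual tasks again.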
The Cell looks even more powerful, in that you can chain execution modules together, so you can load code into APU's 1,2,3,4 and stream the data through 1,2,3,4 automatically before it's considered 'done'. This was possible on the PS2, but
Simon
Re:Consider a different approach (Score:2)
Re:Consider a different approach (Score:3, Interesting)
I *think* the programming model will be sort-of-like CORBA, with 'messages' being sent from a central despatcher (the G5 probably, though it could be another APU). I think the messages will be self-contained program+data though - they've even called them APUlet's. The OS then schedules them to be executed on the first available APU.
The message is the data, but the code will be bundled along with it, and when it's finished, it'll send another message back to the despatcher
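A guess at what that could look like from the dispatcher's side, sketched in Python. Threads stand in for APUs, and the `Apulet` class, the `dispatch` function and the default of 8 workers are all assumptions for illustration; nothing here comes from the patent or from any real Cell toolchain.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Apulet:
    """Hypothetical self-contained message: the code travels with its data."""
    name: str
    code: Callable[[Sequence[float]], float]
    data: Sequence[float]


def dispatch(apulets, n_apus=8):
    """Hand each apulet to the first free worker and collect the replies by name."""
    replies = {}
    with ThreadPoolExecutor(max_workers=n_apus) as pool:   # threads stand in for APUs
        pending = {pool.submit(a.code, a.data): a.name for a in apulets}
        for done in as_completed(pending):                 # reply order is not fixed
            replies[pending[done]] = done.result()
    return replies


jobs = [Apulet("sum", sum, (1.0, 2.0, 3.0)), Apulet("peak", max, (4.0, 9.0, 2.0))]
replies = dispatch(jobs)
```

Because each message carries everything it needs, the dispatcher does not care which worker picks it up, which is the property that would let the OS schedule onto whichever APU is free.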
Re:Consider a different approach (Score:3, Interesting)
Secondly, Darwin will not need porting to the Cell. It will almost certainly run with no modification on the PU. Things like QuickTime, Quartz and CoreVideo/Audio are likely to benefit by having components run on an APU, as might things like the network stack, bu
Re:There are always critical sections (Score:2, Insightful)
Not really. Current gaming computers are usually bogged down while trying to display a graphically intense game. Home electronics are composed of video and audio. Much of 2D and 3D visualization and audio are "embarrassingly parallel problems". Take the video encoding/decoding example from the article, you don't need to parallelize a video frame in terms of each pixel element, instead, one opts to parallelize each video encoding process t
Re:There are always critical sections (Score:2)
Timothy, Saturday night (Score:5, Funny)
Re:Timothy, Saturday night (Score:2)
Re:Timothy, Saturday night (Score:2)
Some Thoughts (Score:5, Insightful)
First of all I want to say I think it is completely possible to make a processor with 8 APUs and so forth. For starters, PowerPC chips already have several separate execution units on them, and I think they use fewer transistors than Intel chips. Moreover, a huge chunk of the transistor budget goes to doing things like cache consistency or complicated instruction prediction, which is probably not used on the much simpler APUs.
Of course it seems like this is primarily of interest to game systems or signal processing applications (note that a 4-threaded, 32-stream processor is just another way of saying 4 Cell processors, each with a PPC core and 8 APUs). However, I would not be so quick to dismiss this for the PC market. While it may be true that many individual applications may not easily multi-thread, it seems we are approaching a point where the biggest complaint is not the maximum processing rate in one application but the ability to run multiple applications at once. On my computers I'm rarely if ever frustrated at the rate some program is running at, but rather at the slowdown in other programs when I run a processor-intensive job or turn on a video. So while drawing a webpage may not be sped up by this processor, drawing several webpages at the same time will be, and that is the sort of thing which makes a big difference for the end user.
Also, a processor like this offers great possibilities for JIT and VM code. The main thread can dispatch instructions and threads to the APUs dynamically based on what is happening in the system. Also I find it interesting that IBM is going the same way as intel in pushing all the complexity on the compiler. It makes one wonder if itanium is really as dead as everyone thinks. Perhaps in 4 years when AMD can't squeeze anything more out of x86 intel will be ready to jump in having worked out all the bugs to their new chip.
Re:Some Thoughts (Score:2)
Re:Some Thoughts (Score:2)
Re:Some Thoughts (Score:2)
I think that the author "knocks" current CPU architecture entirely too much (both PPC and x86) with the comment that the vector units on these chips aren't dedicated enough. While somewhat true - it's also misleading. Typical application code isn't terribly suited to vector processing. Pushing pixels, and decompressing and compressing video and audio - sure. Word processing code, not so much so.
Of more possible interest than pushing the complexity to the compil
Re:Some Thoughts (Score:2, Informative)
Check.
Multiple function units on a chip is not the same thing as the 8 APUs of the Cell. First off, there's no indication whatsoever that this is a single-chip architecture. Even if it is a single chip solution, the coupling of a superscalar's function units
Re:Some Thoughts (Score:2)
As for the question of eliminating the cache I thought this was true in name only. Isn't that 8K or whatever each APU is claimed to have access to supposed to be on chip?
Dupe posting was worth it for this comment: (Score:2)
Ahahahahaha!
Merrimac streaming processor is like CELL (Score:3, Informative)
It's so similar that you wonder if they lifted it from him. The only difference is that Prof. Dally's chip has a big cache.
Cell architecture? (Score:2)
What I can't help but think (Score:5, Interesting)
This is, of course, all just conjecture.
But when I begin to see people seriously talking about the chip from the Playstation 3 eventually potentially being used in PC hardware, I begin to wonder if it's maybe reasonable conjecture...
Re:What I can't help but think (Score:2, Interesting)
Apple -> PowerPC
Cell -> PowerPC
IBM, Sony -> Cell
IBM, Apple -> Linux, BSD (Unix)
Doesn't take a genius to come up with:
IBM->Cell->Apple+Sony
Sony makes the best computers, Sony makes one of the best gaming console.
Although I'd rather see Apple join forces with Nintendo since these two companies are more alike than any other (quality over quantity).
Re:What I can't help but think (Score:2)
What MS really worries about, and what you got at least somewhat right, is the "Media Center" idea. Even
Re:What I can't help but think (Score:2)
3 architectures (Score:5, Interesting)
It's been said before, but mature industries tend towards three of something, such as GM-Ford-Chrysler. For CPUs, it has to be AMD64/ia32e, PowerPC, and SPARC. They're the only ones with any high-volume prospects. SPARC will certainly be in third place, with AMD64/ia32e and PowerPC duking it out for one and two. The fact of the matter is that Itanium won't be a mainstream processor, and PA-RISC, Alpha, and MIPS are all more-or-less EOL.
For operating systems it will still be Windows, Linux, and UNIX (predominately Mac OS and Solaris). Okay, that's four, but the other historical major players are all becoming niche legacy platforms.
For office suites, it'll be MS Office, StarOffice/OpenOffice.org, and iWork. The others are all niche players.
For browsers it'll be IE, Firefox, and Safari.
At least this will tend to simplify some things, because the non-Microsoft platforms will be fewer making supporting them easier. This is a good thing, IMO.
Re:3 architectures (Score:5, Funny)
Re:3 architectures (Score:4, Interesting)
I don't think it does. Microsoft will be around for a while, unfortunately. In my sig, I expect Solaris, Mac OS, and Linux to be the top three of the UNIX side (not necessarily in that order). The BSDs are there for completeness, as they are good systems but are niche players. The main point behind my sig is that all the options listed are either cheaper/freer than Microsoft's options or just flat out better than Microsoft's options (or both). Microsoft really is in a precarious situation, where they have only inertia carrying them at the moment (granted, it's a lot of inertia but it's definitely finite).
Re:3 architectures (Score:2)
It is finite of course but it isn't fixed: it increases each time someone creates a Word document, an IE-only webpage, a HW device which works only with Windows, etc.
And it decreases each time someone uses open standards or uses MacOS X.
Re:3 architectures (Score:2)
That example is a little antiquated. So which are the three car makers now?
It seems to me that the number of players varies for every industry. After all, wasn't "one" the number of major players in the PC OS business?
-a
Re:3 architectures (Score:2)
It's been said before, but mature industries tend towards three of something, such as GM-Ford-Chrysler.
And what about Toyota, Hyundai and VW? You have a very US-centric view here.
For CPUs, it has to be AMD64/ia32e, PowerPC, and SPARC. They're the only ones with any high-volume prospects.
I don't have any links to prove it, but I am fairly certain that in the last few years, there have been sold more ARM-based CPUs than those three architectures combined.
I think you oversimplify things a bit with t
Bread and circuses... (Score:2)
Talk is cheap, and hollow hype is worthless. (Score:3, Insightful)
And if I had 4 legs, I could outrun a dog.
But I don't, so I can't. And this chip won't be as good as the (overenthusiastic) article claims. It won't take over the PC market.
This chip will take over the PC market the same way that BitBoys took over the graphics card market; the same way that Transmeta took over the mobile CPU market; the same way that the Elbrus 2k took over the desktop CPU market. That way is: deliver endless hype that you can't possibly back up. By the time it hits the market, the hype will be so built up that people won't be able to help but to feel let down by the chip. Then they'll lose interest in the product.
This chip might be fast for the money, and enable them to put 4 cores in a consumer device like the Playstation, but it's not going to outperform (or even match) a CPU like the P4 or Athlon 64.
When will people learn to stop falling for the same tricks?
Steve Jobs, Vectors and OS X (Score:2, Insightful)
He also sold tens of thousands of these boxes to a government agency whose name is Not Said Aloud. Seems their early APU-like design was very good at some important things.
Cells are the Next big thing. PS3 will indeed kick ass - real time virtual video - and so will fut
Re:Steve Jobs, Vectors and OS X (Score:3, Informative)
Wrong. Jobs hired the guy who produced the Mach operating system at Carnegie Mellon, Avie Tevanian [apple.com].
Sounds like BS (Score:2)
while a PS3 sits in the background churning through a SETI@home [SETI] unit every 5 minutes.
The Cell is designed to fit into everything from PDAs...
Let's ignore the obviously ridiculous claim that supercomputer-scale computing power is coming to my home in the next year or two and think about power consumption. How is this uber-CPU going to get enough battery power in a PDA?
No one has mentioned the Transputer (Score:3, Interesting)
Cray-4 On a Chip (Score:2)
4 Cells? (Score:2, Insightful)
Semiconductor Reporter article... (Score:4, Interesting)
Looks like pilot production should begin soon on a 90 nm. process similar to that used for current Athlon 64s and Opterons. No word in this article on initial clock speeds and power dissipation.
Anyone have additional info?
BTW, another article I hadn't seen linked [com.com] claims that Cell will be relatively easy to program...seems that Sony learned from some of its PS2 mistakes. That contradicts a lot of the threads responding to the original article and this dupe.
Re:Thats bull (Score:2)
No, it's not bull. (Score:2)
Simon.
Re:slashdot editors thought this article... (Score:2)
As I've suggested on previous occasions, it'd be better to start up a pool as to when it's posted again, and who posts it. My guess is Cmdr Taco, next Tuesday.
Odds are pretty long that Timothy would post it again, but never say never. =)
Re:Well, this could use some more reiteration... (Score:5, Informative)
Paper Details:
Re:So what really do we have here? (Score:2)
That's kind of going to become a problem though if they're seriously expecting the Cell to be used outside the EVERYTHING-MUST-BE-OPTIMIZED world of video games. I mean, it seems like one of the big contributing factors in the death of the Itanium was that the hardware was so batshit bizarre that compilers co
Re:So what really do we have here? (Score:2)
Re:Dupe alert (Score:2)
You're right. (Score:2)
While I don't think the author is a dickhead, he makes crazy assumptions and fills his editorial with personal opinions, rendering this "article" nothing more than a weblog.
Everyone else around here is either bitching about the post being a dupe (without realizing the fact that their own posts are dupes of all the other losers that are pointing out the dupe) or reading this thing like it's fact.
While I'm sure these new CPU's will be novel, and they might be fast at some oper
Re:Not Again! (Score:2, Insightful)
For my rebuttal,
AMD is better yet cheaper, Linux i