Design Philosophy of the IBM PowerPC 970
D.J. Hodge writes "Ars Technica has a very detailed article on the PowerPC 970 up that places the CPU in relation to other desktop CPU offerings, including the G4 and the P4. I think this gets at what IBM is doing: 'If the P4 takes a narrow and deep approach to performance and the G4e takes a wide and shallow approach, the 970's approach could be characterized as wide and deep. In other words, the 970 wants to have it both ways: an extremely wide execution core and a 16-stage (integer) pipeline that, while not as deep as the P4's, is nonetheless built for speed.'"
What is: 2H03? (Score:2, Interesting)
Question (Score:5, Interesting)
Why wouldn't Apple go with the Power4 over the PPC970? And I already know that nothing official has been announced by Apple and that this is all probably going to be a lot of Sturm und Drang signifying nothing, but that's what keeps us Mac guys going, I guess.
$$$/performance (Score:4, Interesting)
I recall IBM's PPC boards going for over a grand, which is (to me) far too much. Especially when it was a 'G3' chip.
Even if the new chip is faster, will I be able to buy 2 Pentium 4s (5s?) for the price of it?
Comparison without AMD? (Score:4, Interesting)
Whoa (Score:5, Interesting)
Before all my fellow Mac users start A) thinking about going to Linux B) drooling C) wondering about Darwin or D) some combination of the above, let me remind you that Darwin scales very well. You can now return to your previous state of awe.
PS - How much do you want to bet good ol' Steve is already having wet dreams about doing the traditional Photoshop test at a Macworld with 4-way SMP?
Wide and Deep - Hail Mary pass perhaps? (Score:2, Interesting)
Die size as an indicator of cost (Score:2, Interesting)
Re:They wouldn't have to redo anything... (Score:2, Interesting)
Re:I am tired of 64 bit lies ! (40 bit only, folks (Score:3, Interesting)
I dunno 'bout Macs (I don't know the M68k's "bitness"), but Intel introduced the 386 (their first 32-bit CPU) in 1985. And I certainly don't think the M68k was a RISC processor.
at current prices and projected prices, 512 gigabytes of RAM will barely cost more than a couple of the fastest processors of this type.
Really? I would LOVE to be able to buy 512 gigabytes of RAM for the cost of a couple of fast desktop processors. Don't forget that the PowerPC 970 is meant to be a desktop processor.
Re:wide / transistors (Score:5, Interesting)
And VLIW (even EPIC) requires the compiler to adapt the code to the hardware. The 970 uses hardware to adapt the software to match its needs. A next-generation version could have 10 units, and the software wouldn't be stuck optimized for 3-instruction bundles (as it is with EPIC), etc....
Also, while there are more int units in the P4, my understanding is that they are not all full units. And in the 970's case, you were ignoring that the vector unit is a better int unit for most things than the int unit (and it adds 2 more ALUs, each of which is effectively up to 16 separate ALUs).
The point is that if I'm doing int calculations or moves, the AltiVec unit is much better than the int units in the P4. So the only things the 2 int units in the 970 should be used for are address calculations, branches, and the few scalars that don't fit the vectors.
Re:They should make it work three ways (Score:3, Interesting)
Because that is where most of the desktop CPU money is going, some of the high-end money, and, frighteningly enough, a fair bit of the embedded CPU money too.
In short, if you can navigate the patent minefield and the brutal-competition minefield, and deal with the instruction set making things a royal bitch, doing an x86 CPU is a total no-brainer.
Other than needing a whole new decoding front end, and being forced to use a trace cache because decoding multiple instructions in x86 land is very hard... the instruction thing isn't a big deal. Handling the oddball 80-bit FP format is. So is emulating all of the trap stuff and the other little odd bits close to the instruction set (like the MMU).
A big pain. But with much of the effort not being where folks think it is!
Speculate... (Score:3, Interesting)
Re:very deep pipelines (Score:3, Interesting)
(which becomes a more and more significant overall factor as k and f go up...)
I think I saw at least one CPU where latch delay per stage was equivalent to stage-execute time.
The biggest reason I'm skeptical of deep pipelines is that they suck unless the instruction mix is hand-picked. In nature, 30% of instructions are branches. Compilers can only do so much to lengthen the basic block, and predictors can only be right "most" of the time.
Re:Question IT IS ONLY 40 BITS not 64. (Score:5, Interesting)
Your desire to use address pins (or is it max pinned space per process?) to measure size puts you in a distinct minority. That doesn't make you wrong. But neither does it help make you right in this particular jungle.
Systems whose physical addressing match their claimed "bitness" are probably in the minority.
Some systems provide more physical addressing than register width (later PDP-11s, 8086, S/390), some less (68000, classic CDCs, early POWER). The 970 falls into the less category. Nothing unusual there.
Apple, like EVERY OTHER OS KNOWN, will steal a bit or two
Some bits come from physical addresses, some from virtual addresses. These should be addressed [pun slipped in, sorry] separately. AIX, btw, steals less than one bit. Linux can also be configured to steal less than one bit. (Assertions I can get away with no loss of credibility, since AC's have none to start with.) Were you frightened by a VAX in your formative years?
Why do fanboys mod stuff like this down?
Because we can't figure out why someone who needs 512GB, or 1TB, or more (which is it?) cares that a Linux process is limited to 1GB and not 2GB or 4GB.
Re:All this talk... (Score:2, Interesting)
0. When I say I want OpenMP added to gcc, that sort of implies that I want the compiler directives added, and the library routines added to the standard libraries that ship with the compiler. I realize gcc isn't going to affect any environment variables.
1. I write scientific code. That's how I know OpenMP. I think it's great for that.
2. It's not really overkill, because it's quite easy to program for (in a portable way!). Much easier than fork(), and IMHO easier than pthreads.
3. Admittedly, I've never needed much in the way of IPC for the codes I've written for SMP (scientific codes on SMP machines don't need much of it), but you could probably use a pipe, a socket, or whatever else. I mostly want it for the scientific uses, though.
4. MPI is good. But OpenMP is like 1000x easier to write for, and on good hardware is usually better ... for my problems.
5. If IBM, or even Apple, makes affordable, good SMP boxes with this processor, OpenMP would be quite useful.
6. The same features it's good at in science ought to make it perfect for other processor-intensive tasks. Anything that needs a for (do) loop can be scaled quite well, and so can anything with chunks that don't need communication. I imagine video/sound encoding might be easily parallelized this way ... much easier than with pthreads, and for this MPI would be overkill, don't you think?
7. All I'm saying is that OpenMP is out there, is supported by most commercial compilers, and is noticeably missing from gcc. *I* would use it. I suspect many more people would use it if it were available ... especially if 4-processor boxes become available on the commodity market, as people in this topic seem to be expecting (dreaming).
Re:unfortunately Apple is going with Intel instead.. (Score:1, Interesting)
Intel seems happy to sit out this round, I'm wondering if they are going to be there next round though.