AMD Unveils Barcelona Quad-Core Details
mikemuch writes, "At today's Microprocessor Forum, Intel's Ben Sander laid out architecture details of the number-two CPU maker's upcoming quad-core Opterons. The processors will feature sped-up floating-point operations, improvements to IPC, more memory bandwidth, and improved power management. In his analysis on ExtremeTech, Loyd Case considers that the shift isn't as major as Intel's move from NetBurst to Core 2, but AMD claims that its quad core is true quad core, while Intel's is two dual-cores grafted together."
Re:Memory Controllers (Score:3, Informative)
Re:Once again... (Score:4, Informative)
Hmmmm Wrong. (Score:3, Informative)
Re:Note to AMD: We don't care (Score:5, Informative)
AMD: a shared L3 feeding core-specific L2 caches. Intel: each core-pair sharing an L2 cache. AMD's approach better avoids threads competing for the same data (thanks to copying it from L3 to every L2 that needs it), while keeping access latencies more uniform and predictable (and thus easier to optimize for).
Other AMD enhancements look more like catch-up to Core 2: SSE [and it's "Extensions", dammit, not "Enhancements"] paths from 64bit to 128bit, more advanced memory handling (out-of-order loads versus Intel's disambiguation et al.), more instructions per clock by beefier decoding (more x86 ops through fast path instead of microcode) and more "free" ops (where Intel added way more discrete execution units from Core to Core 2).
If AMD's quad manages to be better due to better memory bandwidth and latency (in practice), then they were quite right about "true quad-core".
Re:Socket consideration (Score:5, Informative)
The closest thing to a solution we have would be going back to Pentium II/III-style processor-on-a-card designs: the memory slots would move to an expansion card shared with the processor, which would then have a HyperTransport interface to the motherboard.
This works, as some motherboard manufacturers (ASRock on the 939DUAL for one) have implemented something along these lines for AM2 expandability. The problem lies in laying out the circuitry for this new slot, not to mention the incompatibility with many of the large coolers we often use today. It also would become even more complex when faced with another one or two extra HyperTransport lanes as found on Opteron 2xx and 8xx chips, respectively.
AMD made a compromise when they designed K8. On the one hand, the on-die memory controller improves latency by a huge amount and scales much better by completely eliminating the memory and FSB bottlenecks that Intel chips get in a multiprocessor environment. On the other hand, new memory interface = new socket, no way around it.
From what I understand, the upcoming Socket F Opterons will have over 1200 pins in their socket so as to allow both a direct DDR2 interface and FB-DIMM. If I understand FB-DIMM technology correctly, it should end this issue by providing a standard interface to the DIMM which is then translated for whatever type of memory is in use. Logically this will trickle down to the consumers in another generation. For the time being however, AMD has stated that the upcoming "AM3" processors will still work in AM2 motherboards, as they will have both DDR2 and DDR3 controllers.
Re:Quad-core vs. dual-dual-core? (Score:5, Informative)
Re:Once again... (Score:4, Informative)
Now Intel has out-benchmarked AMD, and is attempting to change the rules again to performance-per-watt. This next wave should be interesting to watch.
I guess you won't buy Intel either... (Score:5, Informative)
I suppose that means you won't buy an Intel chip either. Look at what happened with Conroe. Core 2 Duo uses a socket with the same name as the P4 socket, and the same number of pins too. But guess what? When Conroe came out, fewer than a handful of reasonable boards, out of the hundreds of models available, would actually support it. The voltage requirements changed slightly, the BIOS requirements changed, and the end result was that upgrading to Conroe on a given board was hit or miss. I fail to see how Intel's MB upgrade situation is any better than AMD's. It sounds to me like you're falling for Intel's game: "We kept the socket name and number of pins the same, so that means we have better socket longevity." Sorry, but I'm not falling for it. I've read too many horror stories on the forums from Conroe upgraders who thought they could use their current P4 boards.
Don't get me started on Intel's TDP scam either (AMD's = max, Intel's = average). AMD may not always have the best tech, but I find them to be a much more straightforward company, with fewer sneaky games designed to trick customers.
And why are we posting a story about AMD's tech said/written by an Intel employee? Sounds like it was biased before it even started to me.
Re:Quad-core vs. dual-dual-core? (Score:1, Informative)
They claim that this improves performance with virtualization.
From the article:
Barcelona uses a three-stage cache architecture. The L1 cache is 64KB, the L2 cache is 512KB and the L3 cache is 2MB. The L1 and L2 caches are dedicated to a particular core, while the L3 cache is shared among all cores. Note that the L3 cache has been engineered to be variable in size, so that different products may offer different L3 cache sizes. The L1 and L2 caches are exclusive, as with current Opterons and Athlon 64s. This means that the L1 and L2 cache don't hold copies of the same data.
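As a rough sketch of what those figures add up to, the arithmetic below uses only the sizes quoted from the article. The constant names are made up for illustration, and since the quote does not say whether the L3 is exclusive of the L2s, the total is just the raw sum of the cache arrays, not a claim about unique data held.

```python
# Cache sizes as quoted from the article, in KB (names are illustrative).
CORES = 4
L1_PER_CORE_KB = 64
L2_PER_CORE_KB = 512
L3_SHARED_KB = 2 * 1024

# L1 and L2 are exclusive, so a core's private capacity is the plain sum.
private_per_core_kb = L1_PER_CORE_KB + L2_PER_CORE_KB

# Assumption: simple sum of array sizes; the quote does not say how the
# L3 relates to the L2 contents.
total_on_die_kb = CORES * private_per_core_kb + L3_SHARED_KB

print(private_per_core_kb, total_on_die_kb)
```

With an inclusive L1/L2 pair, the per-core figure would be closer to the L2 size alone, which is the practical point of the exclusive design.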
Re:Quad-core vs. dual-dual-core? (Score:5, Informative)
There are also process challenges. Two dies take more space than 4 cores on one die, since you have replicated some of the technology [e.g. the FSB interface drivers]. Space == money, therefore it's more costly.
If one dual-core takes 65W [current C2D rating] then two of them will take 130W at least [Intel's ratings are not maximums]. AMD plans on fitting their quad-core within the 95W envelope. Given that this also includes the memory controller, you're saving an additional 20W or so. In theory you could save ~55W going the AMD route.
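Spelled out, the power comparison above works like this. It is a back-of-the-envelope sketch using only the numbers in this post; the variable names are invented for illustration, and the 20W memory-controller figure is the post's own rough estimate, not a measured value.

```python
# Figures from the post, in watts (names are illustrative).
C2D_RATING_W = 65        # one dual-core Core 2 Duo
AMD_QUAD_ENVELOPE_W = 95 # planned AMD quad-core envelope
MEM_CONTROLLER_W = 20    # rough estimate for an off-chip memory controller

# Two dual-core dies in one package.
intel_two_dies_w = 2 * C2D_RATING_W

# Gap between the two packages alone.
package_gap_w = intel_two_dies_w - AMD_QUAD_ENVELOPE_W

# AMD's envelope already includes the memory controller, so add it back
# on the Intel side to compare like with like.
total_gap_w = package_gap_w + MEM_CONTROLLER_W

print(intel_two_dies_w, package_gap_w, total_gap_w)
```

So the ~55W figure is 35W from the packages plus about 20W for the memory controller.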
Also, C2D processors currently have lame power savings: you can only step into one of two modes [at least on the E6300] and it's processor-wide. The quad-core from AMD will allow PER-CORE frequency changes [and with more precision than before], meaning that when the thing isn't under full load you can save quite a bit. For instance, the Opteron 885 [dual core 2.6GHz] is rated for about 32W at idle, down from 95W at full load. I imagine the quad-core will have a similar idle rating.
Tom
True QC versus MCM: (Score:5, Informative)
When all four cores are on a single piece of Si, all sharing an L3 cache, the chips don't need to fight over the external bus as much. The cores can share information between them internally, and do not need to touch the slow external bus to perform cache coherency and other synchronization. Also, a true QC chip presents one load to the outside bus. This means that the bus speed does not need to drop because of electrical load.
There are many people who don't care how the cores are connected as long as the package works. The point is that the way the cores are connected has a direct impact on performance. We'll be talking about Intel vs. AMD cache hierarchy in 2007 when AMD uses dedicated L2 and shared L3 while Intel uses only shared L2. Expect cache thrashing on Intel's true QC chips with heavily threaded loads when they come out. Next I'll hear people say that the cache doesn't matter as long as it works. As long as it works for what? Single-threaded tiny-footprint benchmarks like SuperPi or Prime95? How about a fully threaded and loaded database or any other app that will actually stress more than the execution units?
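To make the thrashing concern concrete, here is a toy model, not a simulation of any real chip. It assumes a fully associative LRU cache, one thread (A) looping over a 4-line working set, and a second thread (B) streaming through fresh lines twice as fast. All sizes, names and the access pattern are made up for illustration; real caches are set-associative and use other replacement policies, and a shared cache can also help when threads share data.

```python
from collections import OrderedDict

class LRUCache:
    """Fully associative cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Return True on a hit; on a miss, insert and evict the LRU line."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        self.lines[addr] = None
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)
        return False

def hits_for_thread_a(shared, rounds=100):
    # Same total capacity either way: one 8-line cache or two 4-line caches.
    if shared:
        cache_a = cache_b = LRUCache(8)
    else:
        cache_a, cache_b = LRUCache(4), LRUCache(4)
    hits, next_b = 0, 0
    for i in range(rounds):
        # Thread A reuses a small working set of 4 lines.
        if cache_a.access(("A", i % 4)):
            hits += 1
        # Thread B streams: two never-reused lines per round.
        for _ in range(2):
            cache_b.access(("B", next_b))
            next_b += 1
    return hits

private_hits = hits_for_thread_a(shared=False)
shared_hits = hits_for_thread_a(shared=True)
print(private_hits, shared_hits)
```

In this contrived pattern the streaming thread pushes A's lines out of the shared cache before A gets back to them, while the dedicated caches keep A's working set resident after warm-up.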
Re:Note to AMD: We don't care (Score:2, Informative)
Re:Note to AMD: We don't care (Score:2, Informative)