Intel Turbo Boost vs. AMD Turbo Core Explained 198
An anonymous reader recommends a PC Authority article explaining the whys and wherefores of the Intel Turbo Boost and AMD Turbo Core approaches to wringing more apparent performance out of multi-core CPUs. "Gordon Moore has a lot to answer for. His prediction in the now-seminal 'Cramming more components onto integrated circuits' article from 1965 evolved into Intel's corporate philosophy and has driven the semiconductor industry forward for 45 years. That prediction, that the number of transistors on a CPU would double every 18 months, has pushed CPU design into the realm of multicore. But the thing is, even now few applications take full advantage of multicore processors. This has led to the rise of CPU technology designed to speed up single-core performance when an application doesn't use the other cores. Intel's version of the technology is called Turbo Boost, while AMD's is called Turbo Core. The article neatly explains how these speed up your PC, and the difference between the two approaches. Interesting reading if you're choosing between Intel and AMD for your next build."
PS. (Score:4, Interesting)
For that matter, can we have one more thing: a way to limit maximum core usage to, say, 10%? Imagine you're playing an old game on a laptop, for example Diablo 2. Many games have the unfortunate habit of consuming all available CPU power whether they need it or not, taking the battery with them.
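On Linux you can already get this with the cpulimit utility, which duty-cycles a process with SIGSTOP/SIGCONT. A minimal Python sketch of the same trick, assuming a Linux/POSIX system (the 10% figure is just the example from above):

```python
import os
import signal
import time

def duty_cycle(limit, period=0.1):
    """Split each scheduling period into a run slice and a stop slice.
    limit=0.10 means the process may run 10% of the time."""
    run = period * limit
    return run, period - run

def throttle(pid, limit=0.10, duration=10.0):
    """Cap a process at roughly `limit` CPU share by alternating
    SIGCONT (let it run) and SIGSTOP (freeze it)."""
    run, stop = duty_cycle(limit)
    end = time.monotonic() + duration
    try:
        while time.monotonic() < end:
            os.kill(pid, signal.SIGCONT)
            time.sleep(run)
            os.kill(pid, signal.SIGSTOP)
            time.sleep(stop)
    finally:
        os.kill(pid, signal.SIGCONT)  # never leave the game frozen
```

This is essentially what `cpulimit -l 10 -p <pid>` does: it caps the process's wall-clock CPU share without its cooperation, so it works even on games that busy-wait.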
Re:"Apparent performance" (Score:5, Interesting)
Why not use the extra transistors... (Score:3, Interesting)
...for more cache instead of more processors? Think of something with as many transistors as a hex core but with only two cores and the rest used for L1 cache! I'd suggest lots more registers as well, but that would mean giving up on x86.
Re:PS. (Score:3, Interesting)
What do you mean "that's why SpeedStep is used, against normal CPU throttling"? SpeedStep is CPU throttling; but on top of that, C-states are also highly effective, or at least ThinkWiki thinks so [thinkwiki.org], and I see no reason to disbelieve them...
Re:Cooling fan noise anyone? (Score:3, Interesting)
I bought an Intel i7-860 recently, and the supplied HSF is barely able to keep the core temperatures under 95 deg. C with eight threads of Prime95 running. Eek!! I replaced it with a cheap Hyper TX3 cooler (larger coolers won't fit with four DIMMs fitted), and it runs at least 20 degrees C cooler under the same conditions. The supplied fan is a little noisy under full load, but for gaming etc. it's not a problem.
Turbo Boost is cute, but I've opted to overclock it to a constant 3.33GHz (up from 2.8GHz) instead, for predictable performance, with no temperature or stability issues. YMMV.
Re:Can we get.. (Score:4, Interesting)
Yes, but it consumes more power and generates more heat than they'd like. Binning is also a bigger deal than you think with CPUs. My CPU can be overclocked significantly, because I got a lucky unit, but not nearly as much as what some people get. My CPU isn't stable at the memory speeds most overclockers see online, either. So in some ways I got a good CPU; in some, it's meh.
On the other hand, there's no way I'd sell a company my CPU & motherboard at the speed I've boosted it up to. Not a chance. It's not 100% stable; there are infrequent glitches, etc. I improved my cooling, decreased my overclock, and I've still had it not wake up from S3 sleep and do a couple of other odd things.
So no, super turbo boost is not what you think it is. Is it a marketing ploy? Everything is a marketing ploy, but it's also a useful feature. Especially on laptops, where all but one core of the CPU can completely shut down and the remaining one can nearly double its clock speed.
Re:Why not use the extra transistors... (Score:5, Interesting)
Larger caches are slower. Moving to a larger L1 cache would either require that the chip run at a lower clock rate, or increase the latency (increasing the length of time it takes to retrieve the data).
As for registers, they did increase them, from 8 to 16 with x64. IIRC, AMD stated that moving to 16 registers gave 80% of the performance increase they would have gained by moving to 32 registers.
Re:Can we get.. (Score:5, Interesting)
So what you do is get people to code apps that use lighter-weight threads. Apple's GCD and the FOSS ports of GCD spawn low-cost (as in low-overhead) threads, so you can cram more in, make them smaller, and relieve part of the dirty-cache problem in using them.
Spawn threads across cores and keep each thread's life simple. Make those freaking cores actually do something. It can be done; it's just that MacOS or Linux or BSD has to be used to run the app/games.
Don't get me started on GPU threading.
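The GCD-style idea above (many small work items fed to one shared pool, rather than one heavyweight thread per job) can be sketched with the Python standard library; the worker count and the work function here are invented for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def work(n):
    """A deliberately small work item; keep each thread's life simple."""
    return n * n

def run_on_pool(items, workers=4):
    # One fixed pool of `workers` threads services every item,
    # queue-style, instead of spawning a thread per item.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, items))
```

`pool.map` preserves input order, so callers get results back as if they had looped serially, while the items are actually spread across the pool.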
Re:Why not use the extra transistors... (Score:3, Interesting)
PII is also of the PPro lineage. And even if PII, PIII and to some degree P-M and Core1 aren't that different, there are supposed to be some notable changes with Core2 and, especially, Nehalem.
Besides, if the tech is good and it works... (look what happened when they tried "innovating" with PIV)
Re:Why not use the extra transistors... (Score:3, Interesting)
Re:Question... (Score:2, Interesting)
Look at the history of the CPU-time-to-running-time ratio for each process, or perhaps also at what typically causes spikes of usage, and move the process to a faster core at that point. Plus a central DB of what to expect from specific processes.
(I'm not saying it's necessarily a good idea; just that it needn't be that hard, OS-wise.)
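A sketch of that heuristic in Python, using Linux's affinity call (`os.sched_setaffinity`); the fast/slow core split and the 0.8 threshold are made-up assumptions, not anything the OS provides:

```python
import os

FAST_CORES = {0}         # assumption: core 0 is the one that boosts highest
SLOW_CORES = {1, 2, 3}

def cpu_ratio(cpu_seconds, wall_seconds):
    """CPU time / running time: near 1.0 means the process is CPU-bound."""
    return cpu_seconds / wall_seconds if wall_seconds else 0.0

def pick_cores(cpu_seconds, wall_seconds, threshold=0.8):
    """CPU-hungry processes go to the fast core, the rest elsewhere."""
    if cpu_ratio(cpu_seconds, wall_seconds) >= threshold:
        return FAST_CORES
    return SLOW_CORES

def place(pid, cpu_seconds, wall_seconds):
    # Pin the process to whichever set the history suggests.
    os.sched_setaffinity(pid, pick_cores(cpu_seconds, wall_seconds))
```

The "central DB of what to expect" would just seed `cpu_seconds`/`wall_seconds` with known figures for a binary before any local history exists.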
Re:Cooling fan noise anyone? (Score:5, Interesting)
predictable performance
Predictable power-drain, you mean, and a predictable shortening of the life of your hardware -- assuming it doesn't just overheat and underclock itself, which I've seen happen a few times.
CPU scaling has been mature for a while now, and it's implemented in hardware. Can you give me any real examples of it causing a problem? The instant I need that speed (for gaming, etc.), it's there. The rest of the time, I'd much rather it coast at 800MHz all around, especially on a laptop.
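On Linux the scaling behavior being described is visible under sysfs; a small sketch for checking what governor and frequency cpu0 is coasting at (the paths are the standard cpufreq locations, but not every kernel exposes them):

```python
def read_sysfs(path):
    # sysfs files are single short strings with a trailing newline
    with open(path) as f:
        return f.read().strip()

def cpu0_state(base="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Return (governor name, current frequency in kHz) for cpu0."""
    gov = read_sysfs(base + "/scaling_governor")
    khz = int(read_sysfs(base + "/scaling_cur_freq"))
    return gov, khz
```

An idle machine with the "ondemand" governor would typically report something like `("ondemand", 800000)`, i.e. the 800MHz coasting the poster describes.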
with no temperature or stability issues. YMMV.
Understatement of the year.
Overclocking is a bit of a black art, for a number of reasons. First problem: how do you know it's stable? Or rather, when things start to go wrong, how do you know whether it's a software or a hardware issue? The last time I did this was taking a 1.8GHz machine to 2.7GHz. I ran SuperPi, 3DMark, and a number of other things, and it seemed stable, but it occasionally crashed. Clocked back to 2.4GHz, it crashed less often, but there were occasional subtle filesystem corruption issues, which was much worse, because I had absolutely no indication anything was wrong (over months of use) until I found my data corrupted for no apparent reason. Finally I set it back to the factory default (and turned on the scaling), and it's been solid ever since.
Second problem: Even with the same chip, it varies a lot. All that testing I did is nothing compared to how the manufacturer actually tests the chip -- but they only test what they're actually selling. That means if they're selling you a dual-core chip that's really a quad-core chip with two cores disabled, it might just be surplus, the extra cores might be fine, but they haven't tested them. Or maybe they have, and that's why they sold it as a dual-core instead of quad-core.
So even if you follow a guide to the letter, it's not guaranteed.
I'm sure you already know all of the above, but I'm at the point in my life where, even as a starving college student, even as a Linux user on a Dvorak keyboard, it's much saner for me to simply buy a faster CPU, rather than trying to overclock it myself.
Re:Why not use the extra transistors... (Score:1, Interesting)
Some architectures are more dependent on cache size than others; the Pentium 4 was awful when cache-starved. The first round of P4-based Celerons were utter dogs with only a 128k L2 cache and a 400MHz FSB, for example, while the jump from Northwood (512k L2) to Prescott (1MB L2) actually hurt performance for many workloads, since the larger cache didn't offset the longer pipeline and clock speeds didn't scale up initially. By contrast, the jump from a 533MHz FSB to a dual-channel 800MHz FSB between the B and C Northwoods was big.
Core 2 was remarkably more tolerant of slow memory and smaller caches.
Because Intel knows their history (Score:4, Interesting)
When Intel came out with the Pentium Pro, they had a good 32-bit machine, and it ran UNIX and NT, in 32-bit mode, just fine. People bitched about its poor performance on 16-bit code; Intel had assumed that 16-bit code would have been replaced by 1995.
Intel hasn't made that mistake again. They test heavily against obsolete software.
Re:Can we get.. (Score:3, Interesting)
How about a small daemon that at intervals reassigns the running processes to the cores in a balanced way (or a way of your choice), and also sets the affinity for new processes? Should be about 30 minutes of work in any fast language of your choice that can call the appropriate commands.
I think you could even do it with bash, although it would not be very resource-efficient. (Hey, everything is a file, even those settings! If not, then they did UNIX wrong. ;)
Remember: you are using a computer, not an appl(e)iance. You can automate whatever you wish yourself. And usually pretty quickly, too. :)
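A minimal sketch of such a daemon's core in Python, using Linux's `os.sched_setaffinity`; the round-robin policy is an assumed choice, and a real daemon would loop, rescan PIDs, and sleep between passes:

```python
import os

def round_robin(pids, ncores):
    """Spread PIDs across cores one core each, wrapping around."""
    return {pid: {i % ncores} for i, pid in enumerate(sorted(pids))}

def rebalance(pids, ncores=None):
    # Apply the balanced assignment; skip processes that vanished
    # or that we lack permission to move.
    ncores = ncores or os.cpu_count()
    for pid, cores in round_robin(pids, ncores).items():
        try:
            os.sched_setaffinity(pid, cores)
        except (ProcessLookupError, PermissionError):
            pass
```

The shell equivalent of the inner call is `taskset -cp <core> <pid>`, so yes, a bash version is just a loop over `/proc/[0-9]*`.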
correct me if i'm wrong, but (Score:4, Interesting)
Correct me if I'm wrong, and maybe I'm missing something here, but I think it's possible to simulate this kind of functionality on Linux with a script. Cores 2 to N are taken offline (echo 0 > /sys/devices/system/cpu/cpuN/online), the "performance" governor is used for cpu0 (which causes it to run at full clock), then the script monitors usage of cpu0 and brings the other cores online as load on cpu0 goes up. When load goes down, the other cores can be taken offline again.
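A sketch of that script's logic in Python (the sysfs hotplug paths are the standard ones, but the 90%/50% thresholds are invented, and writing those files requires root):

```python
def cores_wanted(load, online, total, high=0.9, low=0.5):
    """Grow the online set when cpu0 saturates, shrink it when idle.
    `load` is cpu0 utilization in [0, 1]; returns the new online count."""
    if load > high and online < total:
        return online + 1
    if load < low and online > 1:
        return online - 1
    return online

def set_online(core, up):
    # cpu0 has no online file: it can't be hotplugged on most kernels,
    # which is fine since this scheme always keeps it running.
    path = "/sys/devices/system/cpu/cpu%d/online" % core
    with open(path, "w") as f:
        f.write("1" if up else "0")
```

Stepping one core at a time per polling interval gives some hysteresis for free, so a brief load spike doesn't wake every core at once.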