Intel Core 2 'Penryn' and Linux 99
An anonymous reader writes "Linux Hardware has posted a look at the new Intel "Penryn" processor and how it will work with Linux. Intel recently released the new "Penryn" Core 2 processor with many new features. So what are these features, and how will they translate into benefits for Linux users? The article covers all the high points of the new "Penryn" core and talks to a couple of Linux projects about end-user performance of the chip."
Re: (Score:2, Informative)
Re: (Score:1)
Perspective (Score:5, Insightful)
"So we do not plan on adding SSE4 optimizations. We may use SSE4 instructions in the future for convenience once SSE4 has become really widely supported. But I personally don't see that anytime soon..."
I think that puts the hype over Penryn into perspective. There are some nice improvements, reduced energy leakage and such, but it's nothing revolutionary.
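For what it's worth, here's a minimal sketch (my own illustration, not from the article or the quoted developer) of how a project can keep an SSE4 path optional until the instructions are widely supported: check CPUID at runtime and fall back otherwise. It assumes GCC's <cpuid.h> helper, which newer GCCs ship.

    /* Hypothetical runtime check; CPUID leaf 1, ECX bit 19 indicates SSE4.1. */
    #include <cpuid.h>
    #include <stdio.h>

    static int have_sse41(void)
    {
        unsigned int eax, ebx, ecx, edx;
        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
            return 0;
        return (ecx >> 19) & 1;
    }

    int main(void)
    {
        printf("SSE4.1 %s\n", have_sse41() ? "available" : "not available");
        /* a real encoder would select function pointers for its hot loops here */
        return 0;
    }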
Re:Perspective (Score:5, Interesting)
I couldn't find the article with a quick Google, but I'm sure someone will dig it up.
Re: (Score:3, Informative)
Re:Perspective (Score:5, Informative)
Re: (Score:2, Informative)
Re:Perspective (Score:4, Insightful)
Can CPU performance hit a threshold? Sure it can. But maybe by then they will be integrating specialty processors for video encoding/decoding, data encryption, or file system/flash write optimization onto the CPU die. At some point nothing more will be required for corporate America to run word processors and spreadsheets, and tech spending and development will shift to smaller, virtual-reality-type applications rather than the traditional desktop. I think we have already reached the point where the desktop computer fulfills the needs of the typical office worker. The focus shifts to management & security over raw performance.
Re: (Score:3, Interesting)
Re: (Score:1)
This may be true, as M$ may never release a new OS after the massive failure of Vista. The only way it sells is by forcing it down the throats of those buying new computers who don't have a clue. Even many of those buyers are buying XP to replace that piece of crap.
Re: (Score:2)
Re: (Score:2)
Re: (Score:1, Interesting)
Re: (Score:1)
I think you have no idea what you are talking about. If you take silicon as an example, its crystal form has atoms separated by about 1 nm (nanometer), but if you add an impurity, its effect spreads over a radius on the order of 10 nm.
So you cannot go under 10 nanometers, because separate circuits in the micro-(nano?)-processor start to i
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
Also, Intel has a tech page [intel.com] where they describe this 2 year cycle.
Re: (Score:2)
Re: (Score:1, Informative)
RAM is another area that needs work. I mean, RAM speeds have fallen far behind CPU speeds, and caches aren't big enough to avoid being saturated by modern softwa
Re: (Score:2)
Hard drives could also be improved. If you had intelligent drives, you could place the filesystem layer in an uploadable module and have that entirely offloaded to the drive. Just have the data DMAed directly to and from the drive, rather than shifted around all over the place, reformatted a dozen times and then DMAed down
Uhh, what? Just a couple of lines above you said CPUs were overpowered, and now you want the filesystem code to run on the hard drive? Specialized hardware may be faster, but which filesystem are you gonna have running on your hard disk? NTFS? ext3? ZFS? ReiserFS?
Re: (Score:1)
The idea behind it is that specialized "application" processors in architected groups would comprise an "operating system" of sorts, married to hardware, that would be quicker than a generalized computer with a monolithic OS.
Re: (Score:1, Informative)
It's been a long time since raw computational power has driven CPU development. Almost all improvements are about further hiding latency. Your comments about RAM, disk, etc. are all about I/O latency. This has been getting worse and will continue to get worse. It's been getting worse for decades, and the majority of what CPU designers think about is how to deal with that fact.
From your comment
Re: (Score:1)
Re: (Score:2)
That's true for sufficiently brain-dead definitions of "revolutionary." Hafnium-based high-k transistors are revolutionary. Instruction throughput isn't everything. Manufacturing technology needs breakthroughs too. Or did you see no point in the continuous shrinkage from 100 microns down to where we are now?
Re: (Score:2)
I remember when Unreal was released: it had software rendering via 3DNow!, and it was far from satisfactory, and not just in resolution; turning that down still led to problems.
Re: (Score:1)
Re: (Score:1)
Re: (Score:2)
I think that puts the hype over Penryn into perspective. There are some nice improvements, reduced energy leakage and such, but it's nothing revolutionary.
Improvements in fabrication technology have nothing to do with improvements in the ISA, beyond the extent to which the ISA relies on the performance provided by the process. The process improvements in Penryn are revolutionary. 45 nm on hafnium gates, with a whole slew of other process changes needed to make it work, is something that five years ago wasn't even believed possible; I recall gloom-and-doom predictions that the brick wall was at 65 nm.
Practically, Penryn may be an incremental step, but the pro
If video encoding/decoding is the bottleneck... (Score:5, Interesting)
Also, an earlier comment that may be useful in this discussion: Why smaller feature sizes (45nm) mean faster clock times. [slashdot.org]
--
Educational microcontroller kits for the digital generation. [nerdkits.com]
Re:If video encoding/decoding is the bottleneck... (Score:5, Insightful)
Re: (Score:2)
Re: (Score:3, Interesting)
For example, many server workloads are handled best by a chip like Sun's UltraSparc T1, which doesn't have any floating point capabilities worth mentioning. People running that kind of server wouldn't buy a Xeon or Opteron that had a 600M-transistor vector processor. It's a h
Re: (Score:2)
Is it? How much does that slot, bus, southbridge, etc., cost? CPUs are cheap! Certainly cheaper than most graphics cards. And the proximity to L1/L2 cache and computational units might make for some interesting synergy.
Re: (Score:2)
With PCI express and the bandwidth it can handle, it might be the best option to put it either on a separate daughter card or allow a separate video card to be installed and dedicated specifically to this. Either way, processor, daughter cards, or video c
Re: (Score:2)
The place for hardware decoders is on the graphics card. Hence the reason why Linux needs to use the CPU.
Why? If you're going to be displaying the video on screen, then yeah, it makes sense to have it on the graphics card. But why can't we just have a general-purpose codec card? What if I don't want to display video, I just want to encode/decode it? Surely this is such a fundamental need that it deserves its own chip. If they can fit an encoder into a 1-pound handheld digital camcorder, why can't they p
Re:If video encoding/decoding is the bottleneck... (Score:5, Informative)
instead of trying to cram extra instructions
Cram? Chip designers get more and more transistors to use every year. I don't believe there's any "cramming" involved.
into an already bloated CISC CPU?
You're about 15 years out of date. The x86 isn't exactly a CISC CPU; it's a complex instruction set that decodes into a simpler one internally. Only the Intel engineers know how they added the SSE4 instructions, but based on the comments of the encode/decode guys, these new instructions sound a lot like the old instructions. It's not too hard to imagine that they didn't have to change much silicon around, and maybe got to re-use some old internal stuff and just interpret the new instructions differently.
Anyway, so why not just have a dedicated piece of silicon for this exact purpose? Partly because it'd be more expensive (you'd basically have to implement a lot of the stuff already on the CPU, like cache, etc.), but also because it's just too specific. How many people really care about encoding video? 5% of the market? Less?
Hardware decoding on hardware is already a reality, and has been for some time. GPUs have implemented this feature for at least 10 years. But of course it's generally not a feature that has dedicated silicon; it's integrated into the GPU. If this is the first you've heard of it, it's not surprising. The other problem with non-CPU-specific accelerations is that they don't ever really become standard, as there's no standard instruction set for GPUs, and even a GPU maker may just drop that feature in the next line of cards.
In short, specialized means specialized. Specialized things don't tend to survive very well.
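As a rough illustration of the point that the new instructions look a lot like the old ones (my own sketch, not Intel's code or any encoder project's): SSE2 already had PSADBW for sums of absolute differences, and SSE4.1's MPSADBW is a close relative that produces eight overlapping 4-byte SADs, handy for motion search. Something like the following, built with gcc -msse4.1:

    #include <emmintrin.h>   /* SSE2: _mm_sad_epu8 (PSADBW) */
    #include <smmintrin.h>   /* SSE4.1: _mm_mpsadbw_epu8 (MPSADBW) */

    /* SAD of two 16-byte blocks, the SSE2 way */
    static unsigned sad16_sse2(const unsigned char *a, const unsigned char *b)
    {
        __m128i va = _mm_loadu_si128((const __m128i *)a);
        __m128i vb = _mm_loadu_si128((const __m128i *)b);
        __m128i s  = _mm_sad_epu8(va, vb);        /* two 64-bit partial sums */
        return _mm_cvtsi128_si32(s) + _mm_extract_epi16(s, 4);
    }

    /* Eight overlapping 4-byte SADs in one instruction, the SSE4.1 way */
    static __m128i sad8x4_sse41(const unsigned char *cur, const unsigned char *ref)
    {
        __m128i vc = _mm_loadu_si128((const __m128i *)cur);
        __m128i vr = _mm_loadu_si128((const __m128i *)ref);
        return _mm_mpsadbw_epu8(vr, vc, 0);       /* slide ref against a 4-byte block of cur */
    }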
Re: (Score:2, Insightful)
I don't know why you seem to think video encoding is some sort of niche technical application that no one uses. A huge number of people record video on digital cameras and want to email it or upload it without taking too long. Many people now use Skype and other VOIP software supporting real-time video communication. Many people rip DVDs. Many people (although not a huge number) have "media center" PCs which can record video from TV
Re: (Score:2, Interesting)
Someone is definitely not a mainstream CPU designer! It never all fits... ask any floor-planner.
x86 not CISC?! (Score:5, Interesting)
x86 has a hella complex instruction set, and it's decoded in hardware, not software. On a computer. So: it's a CISC. A matter of English, sorry, not religion. Sure the execution method is not the ancient textbook in-order single-level fully microcoded strategy - but it wasn't on a VAX, either, so you can't weasel out of it that way. ;)
Of course, the problem isn't with being a CISC, anyway. Complex instruction sets can save on external fetch bandwidth, and they can be fun, too! It was true 25 years ago, and it's still true now. CISC was never criticised as inherently bad, just as a poor engineering tradeoff, or perhaps a philosophy resulting in such poor tradeoffs.
The real point is twofold, and this: first, that the resources, however small, expended on emulating (no longer very thoroughly) the ancient 8086 are clearly ill-spent. While this may have come about incrementally, it could all by now be done in software for less. And second, while we don't write assembly code any more, we do still need machines as compiler targets; and a compiler either wants an ISA that is simple enough to model in detail (the classic RISC theory) and/or orthogonal enough to exploit thoroughly (the CISC theory). Intel (and AMD, too, of course; the 64-bit mode is baffling in its baroque design) gives us neither; x86 is simply not a plausible compiler target. It never was, and it's getting worse and worse. And that is precisely why new instructions are not taken up rapidly: we can't just add three lines to the table in the compiler and have it work, as we should be able to do; we can't just automatically generate and ship fat binaries that exploit new capabilities where they provide for faster code, as must be possible for these instruction set increments to be worthwhile (a sketch of that kind of dispatch follows at the end of this comment).
Consider, for example, a hypothetical machine in which there are a number of identical, wide registers, each of which can be split into lanes of any power of two width; and an orthogonal set of cleanly encoded instructions that apply to those registers. CISCy, yes, but also a nice target that we can write a clean, flexible, extensible compiler back end for. Why can't we have that, instead? (Even as a frikkin' mode if back compatibility is all and silicon is free, as you appear to argue!)
It shouldn't be a question of arguing how hard it is or isn't for the Intel engineers to add new clever cruft to the old dumb cruft, but one of what it takes to deploy a feature end-to-end, from high level language source to operations executed, and how to streamline that process.
So, sure, give us successive extensions to the general-purpose hardware, but give them to us in a form that can actually be used, not merely as techno-marketroids' checklist features!
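To make the fat-binary complaint concrete, here is a hedged sketch of the kind of per-function dispatch meant above, using GCC's target attribute and __builtin_cpu_supports; both were added to GCC well after Penryn shipped, so treat this as an illustration of what ought to be easy, not something available today:

    /* Two versions of the same routine: one compiled with SSE4.1 allowed,
     * one baseline; a startup check picks which one the function pointer uses. */
    __attribute__((target("sse4.1")))
    static int max_sse41(const int *v, int n)
    {
        int i, m = v[0];
        for (i = 1; i < n; i++)
            if (v[i] > m) m = v[i];   /* vectorizer may use PMAXSD here */
        return m;
    }

    static int max_plain(const int *v, int n)
    {
        int i, m = v[0];
        for (i = 1; i < n; i++)
            if (v[i] > m) m = v[i];   /* baseline x86 code */
        return m;
    }

    static int (*max_int)(const int *, int) = max_plain;

    void init_dispatch(void)
    {
        if (__builtin_cpu_supports("sse4.1"))
            max_int = max_sse41;
    }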
Re: (Score:2)
I'm afraid I find this a little hard to interpret. It's a traditional thing to have flamewars about the 'point' of RISC (simply because - we're back in history here - the argument of H&P was 'measure twice, cut once' and RISC - an object, a project and a design schema - was just an application of that philosophy at a particular point in the technology curve), but the idea of RISC was most specifically to tune the visible ISA to the needs of a compiler back end (along with the execution engine, technolog
Re: (Score:2)
It sounds like Penryn has a bunch of slightly-neat features that we'll start taking advantage of sometime in 2025.
Re: (Score:1)
Heck, for my Core 2 Duo, I need to use the -march=nocona flag for compilation, and if I recall correctly, that was originally added for the Prescott or similar: "Improved version of Intel Pentium4 CPU with 64-bit extensions, MMX, SSE, SSE2 and SSE3 instruction set support." Though, I doubt the software interface to the CPU is that different.
- Neil
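A small illustration (mine, not the parent poster's) of how the -march flag shows up at the source level: GCC defines feature macros such as __SSE3__ when -march=nocona or the matching -m flags are in effect, so code can key off them at compile time.

    #include <stdio.h>

    int main(void)
    {
    #ifdef __SSE4_1__
        puts("built with SSE4.1 enabled");
    #elif defined(__SSE3__)
        puts("built with SSE3 enabled (e.g. -march=nocona)");
    #else
        puts("built without SSE3/SSE4");
    #endif
        return 0;
    }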
Re: (Score:1)
Sean
Then why iNTEL? (Score:1)
Part of the problem is that we (still) don't really know how to design a CPU that is easy to compile fast code for (e.g., in all situations).
Re: (Score:2)
Intel wins in the market partly because it rides on Microsoft's coat tails (why Microsoft wins in the market is another long story, of course), and partly because it has fabrication technology that is actually sufficiently better than the competition to dominate most negative effects of architectural decisions. That in turn is because of economies of scale, and I understand was originally bootstrapped through their memory business, rather than by CPUs as such. And if it wins so utterly in the market, then i
Re: (Score:1)
Re: (Score:1, Insightful)
Anyone who wants to video conference and doesn't have tons of bandwidth.
Re: (Score:2, Funny)
As opposed to hardware decoding on software?
Or redundant redundancies of redundancy?
Re: (Score:2)
Re: (Score:1)
It's probably not, but as far as I'm aware, Thomson's Mustang ASIC is the only commonly available one.
Most hardware video encoding is done with general-purpose DSPs and specialised software.
Re: (Score:1, Interesting)
Re: (Score:1)
1. That's no moon = Force Choke [xkcd.com]
2. But does it run Linux = hmmm....
More Useless Options (Score:5, Insightful)
This just reminds me of CONFIG_ACPI_SLEEP. About twice a month I find myself staring at this option, wondering if I will ever get to use it. Some things just are not worth developer time to implement.
Re:If Intel had a better bus then they would not n (Score:5, Insightful)
Re: (Score:1)
Who gives a shit? (Score:5, Insightful)
Also, as for bus speed, you might note that the real limiting factor is RAM speed. It is pricey to get faster RAM, and that's ultimately where you've got to go for non-cached data. You can build as fast a bus as you like; if you are waiting on the RAM, it gains you little.
Penguin (Score:1)
Re:Penguin (Score:5, Funny)
Re: (Score:1, Redundant)
I miss my GF.
Remember MMX ? (Score:4, Informative)
Re: (Score:1)
Re: (Score:1)
MMX was not useless. Despite its marketing name, it didn't have a whole lot to do with multimedia (though it did have obvious applications in multimedia). It was x86's initial introduction to vector/SIMD instructions. The ability to perform the same instruction on 4 numbers at once (rather than using a loop) was a huge boon. Intel might have marketed it strangely, but to some degree it was Intel playing catch-up to other architectures which had already added vector instructions.
It's true though that we di
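To illustrate the SIMD point above with a toy example (mine, using the standard MMX intrinsics from <mmintrin.h>, not anything from the original post): one PADDW applies the same add to four 16-bit values at once, where scalar code needs a loop. Assumes the arrays are 8-byte aligned.

    #include <mmintrin.h>

    /* scalar version: four adds, one at a time */
    static void add4_scalar(short *dst, const short *a, const short *b)
    {
        int i;
        for (i = 0; i < 4; i++)
            dst[i] = a[i] + b[i];
    }

    /* MMX version: one PADDW does all four lanes at once */
    static void add4_mmx(short *dst, const short *a, const short *b)
    {
        __m64 va = *(const __m64 *)a;
        __m64 vb = *(const __m64 *)b;
        *(__m64 *)dst = _mm_add_pi16(va, vb);
        _mm_empty();   /* clear MMX state before any x87 floating point */
    }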
Being an Early Adopter Sucks (Score:3, Insightful)
Re: (Score:2)
Re: (Score:2)
Adobe Flash and Opera.
OK, it's not hardware or even drivers, but it's enough to make me regret installing 64-bit Ubuntu.
Re: (Score:2)
I'm still running Swiftfire or Swiftfox or whatever it was that allowed it. I haven't tried Opera since upgrading to 7.10, but just google "opera 64-bit ubuntu 7.10". Take this [trentrichardson.com] site for example:
Be aware before you install the 64bit version that you will not be able to install Flash, Opera, Wine, Komodo Edit, or any of the new cool Adobe AIR products. Boy, this has got me where it hurts, being a web developer. Nonetheless, most of these can be installed by following the tutorials for installing on a 64bit machine, but what I would really love to see in future versions is, by default, Ubuntu having the capability of installing and running 32- and 64-bit versions of software. Now I've got no clue how one would begin creating such a work of art, but Apple did it, and I have full faith in the Ubuntu community.
There are no t
Re: (Score:1)
sse4 (Score:1)
Seriously, SHA-1 in microcode would be hella fast.
Re: (Score:1, Informative)
Re: (Score:2)
Via has it for years: AES, SHA, and much more... (Score:3, Insightful)
Re:Via has it for years: AES, SHA, and much more.. (Score:1)
Apple's change to LLVM (Score:5, Interesting)
LLVM == Hot Air (Score:1, Interesting)
Re:LLVM == Hot Air (Score:5, Informative)
That sounds pretty practical to me.
Re: (Score:2)
Apple used LLVM to improve the performance of software fallbacks for OpenGL extensions by a hundredfold [arstechnica.com] in Leopard, and a big part of that was because it was good at optimizing high-level routines depending on the low-level features of the chip, such as AltiVec/SSE2, 32-bit/64-bit, PPC/x86, etc. So it stands to reason that, to the extent that SSE4 is useful, LLVM will make good use of it, just like it did for other extensions.
If a new compiler frontend/backend/whatever improved the performance of those routines 100x, it's because the original routines were horribly inefficient. That is a simple fact and it is still true even when Apple is involved.
Re: (Score:2)
Apple is driving the costs behind LLVM. They are accelerating its development goals, and no, GCC was not capable of providing the improvements to OpenGL and Quartz that Apple needed.
Apple buys CUPS and makes sure it remains the same licensing while paying the salaries of its develo
All That Matters Is... (Score:2)
It does, end of discussion. Everything else is simply about applications.