Pentium 4 Re-evaluated, Again (Again)
An unnamed correspondent writes: "It looks like Tom's Hardware Guide has been busy with the P4. This time a re-compiled version of the MPEG encoder (the same one they benchmarked with in the last article) shows the P4 doing really well. Also interesting is the performance boost that even the PIII and Athlon procs get from the Intel compiler. Take a look at the article here." Seems that as usual, benchmarks are what you make of them. The P4 apparently can perform much better than initial tests have shown. Tom Pabst makes some good (if fawning) points about the complexity and fairness of benchmarking in general, too.
Re:Is he for real? (Score:1)
Re:Pretty neat. (Score:2)
Guess what? We already have. The 486 had instructions that were not available on the 386; programs that use them cannot be run on the 386.
In short, the definition of x86 you seem to be using doesn't exist, and never did. You have never been able to use all the instructions of post-386 processors on a 386, any more than you could with the 286 or 8086. x86 compatibility has always run the other way, and it's still 100% the other way -- all the instructions for the 8086 are available on a P-IV.
Re:Intel/Amd/Cyrix "performance" ? (Score:1)
What do benchmarks mean to you? (Score:2)
Hi folks,
I think it's kind of ridiculous that most folks don't understand the concept of benchmarks. It's common knowledge among hackers that benchmarks test specific aspects of performance, and can be made to show better or worse performance depending on what the benchmark author wants to say. Unfortunately, many folks (maybe not you, but plenty of others) base purchasing decisions on benchmarks and spend hundreds of dollars more than necessary on hardware they don't really need.
Being a programmer myself, I know just how flippin' powerful even the "outdated" CPUs are. Recently, I have worked on the Pentium III, the Celeron, and an old 486 at 66 MHz. Most of my recent work consists of prototypes built for ease of maintenance and clarity rather than performance, and if I may say so myself, they perform extremely well, even on the 486. I'm sure there are areas in computing where a powerful workhorse CPU like the Pentium III or 4 is needed, but what most readers probably don't know is that there are literally thousands of mission-critical, real-time computer systems out there that run on 4- or 8-bit computers at speeds like 1 or 2 MHz, and they get the job done. Every user action is carried out instantaneously.
The ridiculous part is that most folks out there don't understand that a newer CPU won't get them better overall performance. The user still needs to wait for the hard drive to churn, the network card to accept incoming packets, and a thousand other things; besides, it's really the software algorithms and implementation that cause the performance, or lack thereof. (These are the reasons I don't like Intel's claim that their newest CPU will give the user a better Internet experience.) The only place a faster CPU will get you performance is in tight code containing nothing but intense computations. Most folks will think of games when thinking of intense computations. In this case, I agree that it is critical to play Quake at 230 fps rather than 200. :)
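The claim that the CPU is rarely the bottleneck is essentially Amdahl's law. A minimal sketch (the fractions below are made up for illustration, not measured):

```python
# Amdahl's law: if only part of the user's wait is CPU-bound, speeding
# up the CPU leaves the rest (disk seeks, network, etc.) untouched.
def overall_speedup(cpu_fraction, cpu_speedup):
    # cpu_fraction: share of total wait that is CPU-bound (0..1)
    # cpu_speedup: how much faster the new CPU is on that share
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)

# Hypothetical: 30% of the wait is CPU-bound, and the new CPU is 2x faster.
print(f"overall: {overall_speedup(0.30, 2.0):.2f}x")  # far short of 2x
```

Even doubling CPU speed buys well under a 1.2x overall improvement if 70% of the wait is spent elsewhere.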
I apologise for being so blunt in my comment, but I need to run out the door so I'm in a hurry, and I'm kind of frustrated at the things that happen because of marketing and "benchmarks" that don't really mean anything (at least to me). I hope I was able to successfully convey my point without insulting anyone. Hopefully, someone can comment on this and either help me out or prove me wrong... I'm open to others' suggestions.
Kind regards,
Nathaniel G H
Re:Is he for real? (Score:1)
In the meantime, I think it's good that he's rerunning his tests as he gets new data. That's one of the things that's great about the web: nothing is static, and news can react as quickly as events change, unlike magazines, which have press deadlines and have to publish whatever they can by a certain date, only to print a tiny correction a few months later. Tom's being upfront about what's going on, and keeping all of us much more informed than we would otherwise have been...
Re:Speed or compatibility, which do you want? (Score:1)
Eventually, newly released code will contain the newer instructions. And eventually, people will upgrade their software if they think the increase in performance is worth it. Basically, this technology will be adopted eventually, no matter what any reviewers say. Remember when the Pentium Pro was released and people were complaining that it ran 16-bit code slower than the regular Pentium? It's happening again; this time it's SSE2 vs. non-SSE2 and whatnot. I see myself owning a P4 this time next year when they crank it up to 2 GHz and beyond... AMD will eventually need to come up with a new microarchitecture, because the fact is the Athlon does not scale as well as the Pentium IV.
Re:P4 Benchmarks are more controversial... (Score:1)
And how is encoding video to MPEG-4 less useful than Photoshop filters and POV-Ray renders? Those three are basically the exact same kind of operation. If anything, MPEG encoding puts more strain on the memory system than a Photoshop filter would, simply because you're dealing with 4 to 8 gigs of data running through the CPU rather than a 30 megabyte file.
Re:The game's rigged! (Score:1)
Re:Recompiling....For normal users? (Score:1)
But I have no desire to compile all of my command line tools. Wow... I could eke out a little extra performance from grep and ls? I've an Athlon 700, so even if the software is horribly unoptimized, the machine more than makes up the difference.
Re:I wonder if this will be a GPL test case... (Score:3)
license to anyone who receives the binary. Tom would therefore be
entitled to the source. Unless you receive the binary, you would not.
Re:I wonder if this will be a GPL test case... (Score:1)
Question is, who should be notified of this problem?
Also, I read every now and then that if you do not protect your IP rights, they can be taken away. If so, does this apply here too?
Re:I wonder if this will be a GPL test case... (Score:2)
Tom would then be entitled to the source, since according to the GPL, Intel could not restrict him from obtaining it.
--
Re:The game's rigged! (Score:1)
Let's assume that the Pentium 4 (and its derivatives) will scale just like the P6 core - Pentium Pro/II/III.
Let's also assume that the fab process technology will improve just like it did during the lifetime of the P6 core.
And let's assume Moore's law is actually true -- it's a universal constant.
The P6 core started at 150 MHz and reached 1 GHz. The Pentium 4 started at 1.4 GHz, so based on the above assumptions, it will reach at least 9.33 GHz, and 10 GHz is not too far from there.
This may or may not be reasonable, you decide. But it's just interesting to think about...
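The arithmetic behind that projection is simple enough to sketch (assuming, as above, that the P4 core scales by the same factor the P6 core did):

```python
# Extrapolate the P4's peak clock from the P6 core's scaling history.
p6_start_mhz = 150     # Pentium Pro launch clock
p6_end_mhz = 1000      # 1 GHz Pentium III, end of the P6 core's run
p4_start_mhz = 1400    # Pentium 4 launch clock

scale = p6_end_mhz / p6_start_mhz          # the P6 core scaled ~6.67x
p4_peak_ghz = p4_start_mhz * scale / 1000  # apply the same factor

print(f"projected P4 peak: {p4_peak_ghz:.2f} GHz")  # ~9.33 GHz
```

Whether the same multiplier applies to a different core on a different process is, of course, the whole question.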
Interesting... (Score:1)
Freaky.
Re:I wonder if this will be a GPL test case... (Score:1)
the source must accompany the binary (if requested) - the Intel engineers only supplied Tom with the binary, so only Tom can ask for the source. I suspect nobody can force them to release the binary and/or code to anyone else.
comparison (Score:1)
Like the post I saw earlier (I think it was #10) about the recount, it made me realize that the two situations are comparable. Look at it: you have two pairs of parties - Republican and Democrat, Intel and AMD (you decide which maps to which) - and when an issue comes up where either party can gain an advantage (Florida and FlasK), they dive on it with their spin doctors to fit the results to their will.
Neither one will give up, and (as far as I'm concerned) both are only focusing on money and not the best interests of the people or customers (as they should be).
Oh, and a question for those more knowledgeable in programming (I'm only first-year college C++): why would a simple recompile benefit the scores? Since it's an MPEG-4 encoder, shouldn't it already be more or less up to speed?
Re:here spambots (Score:2)
I demand satisfaction.
Athlon 3DNow! optimization? (Score:1)
Anyway, are those optimizations really that necessary? 99% of the software that people use does not need them. How many of your friends use video editing/compression software? And if they do, how often do they use it?
But still, all the hype is kind of fun... or rather, life is kind of boring.
Competition is GOOD (Score:1)
I had my doubts about the P4 and SSE2 in general (Score:1)
Intel fan or not, a lot of people must at least be interested! I sure am.
Same situation with the alpha (Score:1)
Re:I wonder if this will be a GPL test case... (Score:1)
Which is what the post said. It was only a couple of sentences. You really couldn't be bothered to read the whole thing?
Pete
Re:Pretty neat. (Score:1)
Right, but would you like to encode an MPEG-4 movie on a 386? You'd be twiddling your thumbs for a loooong time.
Re:and you call yourselves nerds... (Score:1)
I can't believe I ever enjoyed this place. If this were a magazine I'd ask you to discontinue my subscription.
seeya, suckers.
Tom's Hardluckware (Score:1)
humor for the clinically insane [mikegallay.com]
Please, enough with Tom's!!!! (Score:1)
AMD would have got a flogging if the released this (Score:1)
BTW, does it amuse anyone else that all these benchmarks are being done with Windows 98? What sort of retarded moron would buy one of these puppies, then run Lose 98? Their performance must _really_ suck under Lose 2000!
Re:your addy is confusing (Score:1)
The bottom line (Score:1)
Re:The game's rigged! (Score:1)
Nothing can improve *more* than linearly with clock speed (which I assume is what you mean by "CPU power"). Linear increase is the upper bound. Often the increase is lower due to memory (and other) bottlenecks.
When I said linear, I meant linear with a slope of 1. My mistake. (The slope is greater than 1 -- that is, a 10% increase in clock speed can give a slightly greater than 10% performance boost -- because of the constant overhead of Windows, device drivers, IDE CPU utilization, FlasK MPEG, etc., that is less relative to higher clock speeds.)
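A toy model of that overhead argument (all numbers hypothetical): if the OS, drivers, etc. eat a roughly fixed number of cycles per second regardless of clock, the cycle budget left for FlasK grows faster, in relative terms, than the clock does, giving a slope slightly above 1:

```python
# Fixed-cycle overhead model: background OS/driver work costs a constant
# number of cycles per second, so the useful cycle budget grows
# superlinearly (in relative terms) with clock. Values are illustrative.
def fps(clock_hz, overhead_cps=2e8, cycles_per_frame=1e8):
    useful_cps = clock_hz - overhead_cps   # cycles/sec left for encoding
    return useful_cps / cycles_per_frame

speedup = fps(1.1e9) / fps(1.0e9)   # effect of a 10% clock bump
print(f"+10% clock -> {speedup:.3f}x throughput")  # slightly more than 1.10x
```

With these made-up constants a 10% clock increase yields a 12.5% throughput gain; the effect shrinks as the clock climbs and the overhead becomes relatively smaller.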
The original purpose behind the P4 was to be able to crank up the clock speed. Looks like they have reached their goal and even increased the IPC by just a little. So it looks like this will be a win for Intel. When the Athlon has reached its max clock speed and the P4 continues to crank its speed all the way up to 10 GHz, you'll see.
The original idea was to introduce a new architecture. Intel claimed that the Willamette (sp?) core would greatly increase the speed per clock cycle -- in other words, a 1.5 GHz P4 would be as fast as a faster (in GHz) Athlon. That's what we're paying a premium for. What this shows is that in the best-case scenario, when something is specifically recompiled for the P4 and it's a task the P4 performs best, the P4's advantage is almost statistically insignificant.
Again, its only real advantage here is the clock speed versus the Athlon. True, the P4 is very scalable, but so is AMD's next processor core. The P3 was also very scalable, but AMD ended up beating out Intel when they introduced the Thunderbird core revision. The same thing's likely to happen here -- Intel has the technical advantage in that they can crank up the clock speed in the future, but by the time they actually do it, AMD will have its next core ready, and they'll be able to do it too.
Re:Recompilation and the GPL (Score:1)
From the GPL:
Intel's violating this section of the GPL by distributing a work based on FlasK MPEG and telling Tom that he can't redistribute the binary. The GPL prohibits them from telling Tom what he can do with the binary, or even the source for that matter.
Re:Pretty neat. (Score:2)
AAAAAAAAnyway, I quote [The Register] [theregister.co.uk]:
"Reader John Welter of North West Group, a Canadian Geomatics firm specialising in orthophotography - stretching accurate photographs of the Earth's surface over elevation models of the same area - volunteered us some interesting information on his company's experiences with an early P4 system.
When using the original code, a P4 system took a glacial 19 hours compared with just under 13 hours for a 933MHz PIII. But with code recompiled to use SSE2, the P4 galloped through the test in a shade over seven and a half hours.
Outperforming Alpha
-------------------
"A P4 at 1.5Ghz is now faster when running optimised code then our Alpha production boxes by a sizable margin, where those same Alpha boxes outperformed all our P3 based systems.
"Intel did not take the x87 FPU performance as a prime design goal in the P4. They focused on the SSE/SSE2 unit much more and made sacrifices to the X87 FPU side of things to gain more SSE2 performance. Some may argue this was a bad trade-off but the improvements they have managed on the SSE2 are very impressive.
"Geomatics is extremely CPU intensive and pretty much 100 per cent bound by CPU performance. For this reason we obtained an early 1.5GHz P4 despite the inflated costs in an attempt to determine how much added performance it would give us in reducing our production times."
===
The article then goes on to describe the sweet 'puter setup they use, describes how SSE/SSE2 are an advantage in this particular case, and describes how AMD also plans to support SSE/SSE2 and more.
http://www.theregister.co.uk/content/3/14982.ht
--
New tests (Score:2)
Re:I wonder if this will be a GPL test case... (Score:1)
That's what the email from Hans & Christian (@Intel) says. So to me it definitely looks like they're planning on releasing the source code. They're only a bit late.
p4 (Score:1)
The Pentium 4 Question (Score:1)
.--bagel--.---------------.
| aim: | bagel is back |
| icq: | 158450 |
No GPL Violation Yet; Move Along, People... (Score:2)
If he distributed it, then they would be obligated to provide the source.
I think their goodwill is probably more important to Tom (and the community) in this case. If they default on that, then Tom might as well distribute the program.
But until then, I'd rather he keep reviewing with their help.
---
pb Reply or e-mail; don't vaguely moderate [ncsu.edu].
What this really tells us (Score:2)
Unfortunately, what it could potentially mean is that if Intel were to do some sort of special deal with a proprietary OS maker (MS, for example), they could make that OS run far faster than any other, simply because it'd be compiled with a better optimised compiler.
-- Piracy is a victimless crime, like punching someone in the dark.
This proves the usefulness of Open Source (Score:1)
Now where have i seen this before? (Score:2)
Re:Pretty neat. (Score:2)
Dave
Barclay family motto:
Aut agere aut mori.
(Either action or death.)
Re:comparison (Score:1)
Friggin recounts ! (Score:2)
Which just proves.. (Score:1)
--
Re:here spambots (Score:1)
Compilers are getting MORE important (Score:1)
or any of these comments. However, compilers are getting more important with newer CPU designs. Also, Intel is always going to have the best compiler for x86 chips, and Compaq's compiler is the only one to consider for an Alpha. There should really be a campaign for the CPU vendors to open-source their compilers so that everyone can get the benefit, enhance them, etc. Like, what have they to lose? They will only sell more chips.
Re:Can we smell the hypocrisy? (Score:1)
Re:Athlon 3DNow! optimization? (Score:1)
Re:Your signature... (Score:2)
Dave
Barclay family motto:
Aut agere aut mori.
(Either action or death.)
Re:The Pentium 4 Question (Score:1)
The P4 is pretty mediocre at current clock speeds, but a big advantage of its design is its apparent ability to handle high clock speeds that the P3 can't. So the P4 may not look too hot now, but Intel is expecting to get it up to 2GHz by the end of 2001; if they can do that, they have a winner.
Most people don't seem very adept at the whole "long-term thinking" thing, particularly with technology product releases .. but people should really try to look ahead a little on this one.
I pretty much agree with your third paragraph .. there just isn't the time to spend hand-crafting P4 assembler optimizations for 3d gfx (which I do for a living btw .. ) .. but the compiler should at least be making some attempt to use those instructions, if asked. Of course, we use MSVC, and MS tends not to be very leading-edge in this regard.
Recompilation and the GPL (Score:1)
Re:Pretty neat. (Score:2)
In the first case, it'll go into the statistics - 300,000 people for a new architecture, 460,000 against. In the second case, it'll be, "Well, I met this guy and he was really pissed off that we're still using x86. He seemed to know what he was talking about, and he understood the difficulties involved. However, he really thinks that the sacrifices would be worth it."
Now, which will hold more weight?
Dave
Barclay family motto:
Aut agere aut mori.
(Either action or death.)
Re:Pretty neat. (Score:2)
Fact is, Intel and AMD abandoned x86 to get real work done a long time ago. x86 is emulated on a modern processor, but at the hardware level. The core of the processor itself uses a different instruction set and format.
And like most things, x86 is just behind the times. Like all technology, tradeoffs had to be made. x86 in its 32-bit form goes back to the 386. It was designed to solve a specific set of problems in a certain way. Today we have a different set of problems that also need to be solved in a different way. It's not that x86 was never good - it was very good at the time, and as evidenced by its long use, it had quite a bit of life in it.
Unfortunately, the great lengths that all the major chip manufacturers go to in order to pander to the x86 instruction set really cut down on performance.
Dave
Barclay family motto:
Aut agere aut mori.
(Either action or death.)
Re:I wonder if this will be a GPL test case... (Score:1)
Re:The real problem.... (Score:4)
free compilers to create good support for their CPU. Why aren't these optimisations in gcc??
If regular users cannot get hold of binaries compiled with good compilers, or of the good compilers to compile their own stuff, then their real-life usage of the CPU will look worse than the one in the review. The reviews shouldn't be done with special equipment, be it hardware or software, or with the aid of engineers who know one side only. They should be done with standard equipment, so we, the normal users, would know what to expect.
Intel, AMD, and other CPU makers, x86 or not: give away the compilers and see your hardware shine, or help GCC get good support for your CPU, which we, the normal users, can benefit from.
ion++
ps: there might be other free compilers than GCC
Re:Pretty neat. (Score:1)
Can we smell the hypocrisy? (Score:2)
Naturally, since it's not targeting performance, it benchmarks poorly. Do they (the various Quake 3 monkeys) rerun the benchmarks? No.
The Pentium IV comes out. People plug 'em in, benchmark them. They also suck. They benchmark them again, showing the suck by a larger margin. Then they benchmark them again, showing it's actually not such a bad suckage after all.
Isn't that just a bit of Intel favouritism?
--
Re:Recompilation and the GPL (Score:1)
Here's a direct quote:
"As agreed on the phone please don't distribute this version of flask to anybody else. We still haven't got hold on the author of Flask and we don't want to distribute this version without permission."
It's a little fuzzy, but it seems to me that Tom only agreed, and doesn't necessarily feel legally bound. I guess it all depends on what was said on the phone. I'd like to think Intel only said "we don't like the code as it stands right now, so please understand that we wouldn't like to see it distributed", and that Tom said "OK.".
Re:The game's rigged! (Score:1)
Increasing IPC is a direct tradeoff against increasing clock rate (or the ability to scale the clock rate up). You may be able to increase both to a certain extent through tricks here and there, but in the end, one of them will have to give in to the other. Intel's strategy (like that of most CPU designers) is to increase clock rate, because that is easier than increasing IPC. Increasing clock rate is pretty much a brute-force strategy for increasing performance, while increasing IPC actually needs some brains. Intel managed to squeeze only a little more out of the P4 by using the trace cache and some other stuff like double-pumping their ALUs. No other processor in existence today has a trace cache; before the P4, the trace cache was purely a concept on paper.
Also, I think they were smart to optimize for SSE2 and not for x87 FPU instructions. Why? x87 instructions are legacy code, and slow by nature. Why not design a separate floating-point instruction set that is not "hacked on"? Remember that the x87 is pretty much a hack bolted onto the x86 line (integrated on-die with the 486), and is pretty slow compared to how floating point is done in RISC processors like the Alpha or Sparc. With SSE, Intel is planning to become more on par with those RISC processors (though it will never be as good, IMHO). The Athlon may execute x87 instructions well, but let me assure you, it is still not as fast as it could be. Getting rid of x87 is a much better idea, but I doubt that will happen, since everyone wants backward compatibility. I guess this is something we all have to live with for the rest of x86's life and our lives.
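The tradeoff reduces to a one-line identity: delivered performance is roughly IPC times clock rate. A sketch with made-up IPC figures (purely illustrative, not measured numbers for either chip):

```python
# performance ~ IPC * clock. A lower-IPC design can still win once its
# clock advantage is large enough. The IPC values below are hypothetical.
def perf(ipc, clock_ghz):
    return ipc * clock_ghz   # instructions retired per nanosecond, roughly

athlon = perf(1.3, 1.2)      # hypothetical: higher IPC, lower clock
p4_now = perf(1.0, 1.5)      # hypothetical: lower IPC, higher clock
p4_2g  = perf(1.0, 2.0)      # same hypothetical core cranked to 2 GHz

print(athlon, p4_now, p4_2g)  # the scaled P4 overtakes despite lower IPC
```

This is why a deep-pipeline design can afford to give up some IPC: if the clock headroom it buys is large enough, the product comes out ahead.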
AMD ended up beating out Intel when they introduced the Thunderbird core revision
I wouldn't say they beat out Intel. AMD still has only 20% of the market share. Any way you look at it, Intel is still the winner. I'm not being biased, just objective. AMD may have a better processor, but in the end, Intel still sold more processors. This may be attributed more to marketing prowess and people's brand-name loyalty than to any great engineering feat. But still, that statement you made is wrong... The Athlon didn't win; it lost, in fact. If it had won, AMD would have more than 50% market share. Of course, AMD did reach the goals they set, so it's a win from their and their supporters' perspective.
I wonder who's gonna read this since this thread/topic is already pretty old.
Re:Pretty neat. (Score:1)
See all the discussions of that set of problems when Itanium finally ships.
Re:oh yeah baby. this is the best comment i ever m (Score:1)
2. Yes. I spoke a little too quickly and it came out a little ambiguous. They optimized the decoder. It's possible that the optimizations, or even just a simple retargeted compile, could cause a difference in the raw output fed to the encoder. This would, of course, cause the output file to change. The same would be true if Tom were right and they had actually retargeted/optimized the encoder, which they didn't. Such an encoder could very well produce different output. So Tom's assertion that the output file would be "obviously" identical is clueless. Since Tom has said that the output files were identical, we can unsafely assume that the Intel'ized decoders didn't actually cause any difference in the intermediate data. Of course, given his display of stupidity, I wouldn't be surprised if he didn't even do a checksum. I'm not an AVI expert, but I wouldn't be surprised if the file sizes came out identical due to padding or hard bitrate limits on the encoder. All issues which never appear to have even crossed his mind.
Or seem to be relevant to slashdot moderators.
Re:Can't Tom make up his mind? (Score:1)
As Usual (Score:2)
For most regular software, it did not mean squat. It only mattered for software that was specifically set up to take advantage of the hardware features.
There were certain Photoshop filters that were fantastic at certain settings, but choked when you used others. This is typical of new features.
I was always amused by comparisons of processors running at the same clock speed. Typically, when you do that, the gain in performance is usually about 15% to 25% (YMMV) before you add in the difference in clock speed. All too often the clock speed is a huge factor in the performance boost, not just the design changes.
I think I'll go have another beer....
MPEG-2 to MPEG-4 conversion questions (Score:1)
OTOH, although I know JPEG's internals pretty well, I'm not sure about MPEG
The game's rigged! (Score:1)
Before anybody shoots this down as anti-Intel flamebait, think about it. Why else would Intel risk the bad publicity of using an "illegal" program for benchmarking purposes? I mean, why not recompile gcc or OpenUT for P4 optimizations? Because they knew the P4 would do well at MPEG4 compression.
That's not to say that it would do any better than the Athlon. Some of the speed difference could be attributed to the P4's higher clock speed, but a lot of it is because of the Intel-compiled FlasK's SSE2 support. Had AMD recompiled FlasK to support the Athlon's 3DNow+ MPEG extensions, I'm sure the Athlon would have gained a lot of ground too.
The fact that it made such massive gains on all processors just speaks to that theory. I mean, think about it -- if Intel's compiler could magically make your average program run twice as fast, like it did for FlasK MPEG, developers would be lining up outside of Intel's offices for copies. This means that either (a) Intel changed the internals of the program to "cheat" by lowering quality and skipping the parts that the P4 did poorly at, or (b) FlasK MPEG is a special case.
And the numbers aren't quite as amazing as Tom suggests. If you take the x87 version and assume for simplification that FPS scales up linearly as MHz increases (which is not true -- if anything, something as CPU intensive as MPEG compression will probably improve significantly better than linearly with CPU power) then virtually all of the speed is accounted for by the fact that the P4 is 1.5 GHz -- the Athlon gets 9.28 frames per GHz, whereas the P4 gets 9.35 frames per GHz.
I know this is a terrible method for comparing CPU power, but it shows the basic idea here -- that most of the P4's advantage is due to its speed in MHz, not the architecture. While this is all well and good for Intel if all programs are recompiled overnight for the P4 and Intel can continue to out-clock AMD (given Intel's recent history that's not too likely), in the real world AMD still has the advantage.
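The normalization itself is trivial; here is a sketch using hypothetical absolute frame rates chosen to land near the per-GHz figures quoted above (the actual FPS numbers and the Athlon's clock aren't given in this comment):

```python
# Frames-per-GHz normalization: divide throughput by clock to strip out
# the MHz advantage. The FPS and clock values are hypothetical stand-ins.
def frames_per_ghz(fps, clock_ghz):
    return fps / clock_ghz

athlon = frames_per_ghz(11.1, 1.2)   # hypothetical 1.2 GHz Athlon, x87 build
p4     = frames_per_ghz(14.0, 1.5)   # 1.5 GHz P4, x87 build

print(f"Athlon: {athlon:.2f} f/GHz, P4: {p4:.2f} f/GHz")  # nearly identical
```

When the per-clock figures come out this close, the raw FPS gap is almost entirely a clock-speed story, which is the poster's point.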
So what about those SSE2 scores? Tom glosses over the fact that all but the lowest score are for lower-quality encoding. So yay, we're getting 22.85 FPS . . . in the lowest-quality setting. The high-quality SSE2 setting gives a not-so-stunning 4 FPS boost over the x87 version, which isn't that great of a boost, considering that an Intel engineer hand-optimized the program to work better with P4s. (Well, Intel claims they just recompiled it. But I have yet to see a compiler that adds SSE2 instructions on its own, let alone one that can add new options to a dialog box with SSE2 features.)
At any rate, none of this helps the Office benchmarks, or the UT benchmarks, or the 3DS Max benchmarks . . . or any other benchmark that reflects performance computer users might get except in special cases. Not every computer user can use exclusively GPL software that he/she can recompile at will to support his/her new processor. Besides, that would take a while, since the gcc scores aren't very good either. :)
(And am I the only one that noticed that Intel modified a GPLed program and refused to distribute the source or allow Tom to redistribute it? Isn't that illegal?)
Re:I had my doubts about the P4 and SSE2 in genera (Score:1)
OSes are unlikely to benefit much from SSE optimization. And the Athlon is quite a bit faster in integer performance.
It seems that the only well-designed unit in the P4 is its SSE engine. On the other hand, that might be more related to the high latency of the Rambus memory than to the processor design.
Re:comparison (Score:1)
Re:Poor compairson (Score:1)
Re:SPEC? (Score:2)
I'm not sure that SPEC2000 is an appropriate solution. Most people don't care about the performance of a "quantum chromodynamics" simulation, and are not involved in computational fluid dynamics. The integer benchmarks are a little closer to home (word processing, chess playing, perl...), but unless your "real world" approximates the "real world" the benchmarks are trying to simulate, the results of such benchmarks are difficult to appreciate.
I suspect that to many people, a Quake/Unreal benchmark is much more valuable than SPEC2000 results.
Re:Why? (Score:2)
DVDs have been around for about 3 years now, and yet DVD decoding chips aren't standard on motherboards. We can expect the same for HDTV. Software decoding is going to remain pretty popular, as DVD + big MHz/GHz sells in CompUSA, whereas selling 400 MHz + decoder card = educating consumers = good luck.
Re:I smell a rat... (Score:2)
Re:Intel still my choice (Score:1)
============
i. There is no spoon.
Intel lies lies lies (Score:1)
Re:Pretty neat. (Score:1)
Intel still my choice (Score:1)
============
i. There is no spoon.
Intel's "Secret" Compiler-and the code it produces (Score:1)
RISC archs, on the other hand, seem to be geared more towards a general-purpose processor, not just what Intel thinks computing should be like. When MP3s became big, it was a nightmare to create a fast iDCT on the x86 archs. On RISCs it was (and still is) a piece of cake. Simple programs take forever to execute on current x86s, but their RISC counterparts can blow through anything with almost no effort. RISC processors were designed for "computers", while CISC processors were designed mostly for machine controllers (anyone remember the stop-light tutorial on the early Intel CPUs?) and other things doing predictable, repeating tasks.
Specialized vector processors are the other main type (in my eyes) of processor on the market. They are extremely single-purpose, and extremely fast. Someone could create a DivX-on-a-chip that could process a frame per clock cycle, run it at 10MHz, and blow every other CPU out of the water with >"real time" multichannel DivX encoding on one chip! This is where I am afraid Intel is going. Their chips can still run normal "computer" instructions, at a small fraction of the speed of a RISC CPU at half the clock speed, but can run at an acceptable speed with most "consumer" low-grade multimedia (don't get me started on my MMX-is-evil speech that everyone hears me give).
I would really like to see this SSE2 code that was "hacked" into FlasK. Even more interesting might be the disassembled code that Intel's compiler produced. For all I know, Intel could have done something as simple as recompiling FlasK with their standard compiler, instead of something Microsoft produced.
If everyone was running LinuxPPC \ Linux/Alpha I would have no need to write this
BTW - Anyone working on a Linux-FPGA project?
Recompile your kernel and write it to your processors microcode-ROM. hehehe
Re:AMD would have got a flogging if the released t (Score:1)
Re:The game's rigged! (Score:1)
Benchmarks, hmm... (Score:2)
If you divide the benchmarks by the clock rate, you get a more objective view of the processors, independent of clock rate. By that measure, the P4 is turning out about
The small problem, of course, is that you don't get any of that bang with existing apps, which will undoubtedly hurt P4 sales for a while. Strictly looking at bang-for-the-buck, the P4 is a poor choice at the current price points, but most new processors are. If AMD can get the Palomino (a smaller, faster Athlon) out the door quickly, with SMP support, they could take a bite out of the server market that would have Intel rubbing their bottoms for some time.
_________________________________
Re:You only need to recompile a couple of things. (Score:2)
If anybody tries to build render farms off P4s, they will literally go out of business due to missed deadlines. The P4 pipeline is too deep; the whole thing is geared towards flashy performance on consumer codecs, and it's not general-purpose enough to handle real-world EFX tasks on deadline. The high-end market will be avoiding this one - it's just too easy to push it out of its 'high performance zone' and make it bog down.
Re:I had my doubts about the P4 and SSE2 in genera (Score:1)
Re:Intel's "Secret" Compiler-and the code it produ (Score:2)
Athlon vs. P4 test (Score:3)
I'm not trying to knock Intel per se. My main machine is a P3 (Dell laptop, runs like a dream). But you have to wonder whether the cost is warranted by, in this case, the extra 3 fps in compression.
Re:Athlon vs. P4 test (Score:2)
Re:The real problem.... (Score:2)
Intel does offer their VTune compilers for sale, as they must in order to legally use them in the SPEC benchmarks where they perform so well. Unfortunately, there are widespread complaints and accusations that they are buggy and temperamental and fail to compile much code that works just fine with gcc, VS, etc. The charge that Intel gets its SPEC scores with compilers so aggressively optimized that they aren't robust enough for everyday use has tarnished Intel's very impressive SPEC results in some eyes. I haven't ever tried to use VTune myself, so I can't say whether this is FUD or not. It is worth noting, though, that VTune is much, much faster than anything else in SPEC, yet rarely used in practice, so there must be something to the complaints.
But Intel does also help other compiler makers incorporate optimizations. I know they specifically work with Cygnus to optimize gcc, and would assume they do the same with MS. AMD also works with compiler makers to get support for 3DNow. (For market reasons--i.e. they will always have smaller market share--AMD designed the Athlon to perform well on P3-optimized code, and thus there is not so much to be gained by including K7 optimizations over and above 3DNow. The P4, on the other hand, is very different from both of them and needs a recompile to perform well, as these numbers demonstrate.)
Re:What this really tells us (Score:3)
This is a danger to AMD, which traditionally has had very little to do in this arena. A properly optimizing compiler can make a huge difference and they need every edge they can get to stay on top. Intel understands this weakness of AMD's and will exploit it.
Re:You only need to recompile a couple of things. (Score:2)
Re:way to inflate the statistics (Score:2)
Proved, again... (Score:2)
Sincerely, I think we have had enough of these benchmark judgements. Playing the game of "the judge" is exactly what benchmark tests should get rid of. Frankly, only after a set of benchmarks has been run for some time, and all levels tested/contacted/patched/retested, should people pass judgement. Until then, no benchmark can be taken as a verdict. So, every time someone tells you that Linux suxx and everything else rulez, first check whether the penguin horde is shrinking, then read ZDNet for a month without missing a day, then check the mass media, the benchmark sites, the testers, then check whether freshmeat's submissions have dropped, then check what your friends/colleagues/neighbors say. If everyone says Linux still suxx, then you may take that first benchmark for granted. If not, then the guys have gotten a check from M$. But until then, don't forget to recompile the kernel so that it fits what you really have in your box. That's the best benchmark you can run for yourself...
You only need to recompile a couple of things. (Score:2)
You don't need to.
For the 90% of users out there who need the processing power at all, the only thing that matters is the graphics driver, because it's games that are sucking up the CPU time. Graphics driver upgrades are released fairly regularly.
For the rest, it's MPEG CODECs. I'm sure if your favourite CODEC's site posted an update that ran 50% faster, you'd download it; thus, I don't think upgrading it will be a problem.
The (relative) handful of people doing heavy-duty image processing or rendering will likewise be upgrading to the next version of their software package at some point, which will contain SSE2 code.
The OS itself doesn't need a recompile. Neither do your office applications. Where is the vast pile of software that needs to be recompiled?
Quake3 tests also have SSE2 code. (Score:4)
URLs:
http://www.nvidia.com/Home.nsf/nvidianews2.html
http://www.tomshardware.com/cpu/00q4/001120/p4-20
Re:The Pentium 4 Question (Score:5)
Logically, then, we must all avoid attempting any tasks that the P4 doesn't like ;P
Why? (Score:2)
But you people love big numbers...
The real problem.... (Score:3)
Games. We want faster games. (Score:2)
But how about 3d games? That's the mass market where performance really matters. Invent a better, cheaper, BSP-tree processor and you'll corner the CPU market.
How to ru(i)n a benchmark (Score:2)
Readers: Tom, how fast is the new Pentium 4?
Tom: How fast do you want it to be?
Call me a cynic, but I doubt that even this obvious debacle will convince anyone of the pointlessness of benchmarks. (Five years of Apple saying the PPC is "twice as fast as the Pentium" sure hasn't.) Let's go back to using MIPS; at least everyone knew that stood for "Meaningless Indication of Processor Speed".
Pretty neat. (Score:3)
I also think we should take note of something. *THIS* is the promise of moving to a new architecture, beyond x86. A program compiled to be backwards-compatible right down to the 386 will NOT be able to use SSE2 instructions, nor any other fancy bells and whistles (like 3DNow! and plain ol' SSE). At least, as far as I can tell (I think pgcc can use more advanced instructions and still run on older CPUs).
In essence, Intel is moving away from x86, albeit slowly and painfully. SSE2 is obviously a good technology, but an incompatible one. Programs using SSE2 instructions will need those instructions available when they run; otherwise bad things happen. But what a gain!
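One way to ship SSE2 code without breaking older CPUs is a runtime check with dual code paths. A minimal sketch in Python (assuming a Linux-style /proc/cpuinfo "flags" line; the function and path names here are made up for illustration, not taken from any real codec):

```python
def cpu_has_flag(flag: str, cpuinfo: str) -> bool:
    """Return True if a feature flag appears in a /proc/cpuinfo dump."""
    for line in cpuinfo.splitlines():
        if line.startswith("flags"):
            return flag in line.split(":", 1)[1].split()
    return False

def pick_codec_path(cpuinfo: str) -> str:
    """Dispatch to the SSE2-optimized routine only when the CPU supports it."""
    return "sse2_encode" if cpu_has_flag("sse2", cpuinfo) else "generic_encode"

# Example with a made-up cpuinfo snippet:
sample = "processor : 0\nflags : fpu vme de pse tsc msr mmx sse sse2\n"
print(pick_codec_path(sample))   # picks the SSE2 path here
```

In real x86 code this check is done with the CPUID instruction at startup rather than by reading a file, but the dispatch idea is the same: one binary, two code paths.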
If ever you get the chance to talk to anyone from Intel, say that you'd like to see more of this.
Dave
Barclay family motto:
Aut agere aut mori.
(Either action or death.)
I wonder if this will be a GPL test case... (Score:4)
FlaskMPEG (http://go.to/flaskmpeg) is a project written by Alberto Vigata, whose source code is available under the GPL: http://www.citeweb.com/flaskmpeg/docs/gpl.html
(As an aside, Alberto has been extremely busy of late, and the project has gone a little stale, but it is by no means abandoned, and he has collaborated with several authors to forward the development of FlaskMPEG, though it is slow going.)
Intel has taken this source code, produced a modified binary, and distributed that binary to a third party (Dr. Thomas Pabst). Now, the question is: where is the source code? They are obligated under the terms of the GPL to release it, and so far they haven't. Additionally, they hint that they don't want it distributed, by asking Dr. Pabst not to make the recompiled version of FlaskMPEG available. Is this a violation of the GPL? Probably. Will Intel get away with it? I'd like to see them not, but they probably will.
I'm surprised no one commented on this before; Slashdot goers are usually more on the ball than this.
P.S. The DiVX codec is *not* SSE/SSE2 or 3DNow! optimized, though it does have MMX optimizations. How do I know this for sure? Because DiVX is just a copy of the Microsoft MPEG4v3 codec that has been modified with an assembler/debugger to allow the playback of MPEG4v3 streams inside an AVI, and to stamp streams it creates with the FourCC code of DIV3 instead of MPG4. It wouldn't have been needed at all if Microsoft hadn't artificially restricted the codec from creating or playing back AVI files, instead tying it to the ASF format, and therefore to a Windows-only platform. Can we say 'Embrace and Extend (just enough to break compatibility)'? I knew you could...
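The FourCC re-stamping described above amounts to a byte-level patch of the AVI headers. A crude sketch of the idea (hypothetical, not the actual DiVX hack; a proper tool would walk the RIFF chunk tree and touch only the codec fields rather than blanket-replacing):

```python
def restamp_fourcc(avi_bytes: bytes, old: bytes = b"MPG4", new: bytes = b"DIV3") -> bytes:
    """Replace every occurrence of one FourCC with another.

    Illustrative only: a real tool would parse the RIFF chunk tree and
    patch just the 'strh'/'strf' codec fields, not blanket-replace.
    """
    assert len(old) == len(new) == 4, "FourCC codes are exactly 4 bytes"
    return avi_bytes.replace(old, new)

# Toy header fragment containing the codec FourCC:
fragment = b"RIFF....AVI LIST....strh vids MPG4"
print(restamp_fourcc(fragment))
```

Because a FourCC is a fixed 4-byte tag, swapping MPG4 for DIV3 leaves every offset in the file untouched, which is why the hack works without re-encoding anything.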
Re:The game's rigged! (Score:2)
Nothing can improve *more* than linearly with clock speed (which is what I assume you mean by "CPU power"). Linear increase is the upper bound. Often the actual increase is lower due to memory (and other) bottlenecks.
the Athlon gets 9.28 frames per GHz, whereas the P4 gets 9.35 frames per GHz.
but it shows the basic idea here -- that most of the P4's advantage is due to its speed in MHz, not the architecture
The P4 is faster (but not by much) when normalized, as you said yourself, so why is the advantage due to its speed in MHz? You defeated your own argument.
The original purpose behind the P4 was to be able to crank up the clock speed. Looks like they have reached that goal and even increased the IPC a little. So it looks like this will be a win for Intel. When the Athlon has hit its max clock speed and the P4 keeps cranking up all the way to 10GHz, you'll see.
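The per-clock comparison in this subthread is easy to reproduce. In this sketch the 9.28 and 9.35 frames-per-GHz figures come from the thread itself; the raw fps and clock speeds (1.2 GHz Athlon, 1.5 GHz P4) are back-solved assumptions for illustration:

```python
def frames_per_ghz(fps: float, clock_ghz: float) -> float:
    """Normalize a frame-rate benchmark by clock speed."""
    return fps / clock_ghz

# Hypothetical raw inputs chosen to reproduce the thread's 9.28 / 9.35
# figures (assuming a 1.2 GHz Athlon and a 1.5 GHz P4):
athlon = frames_per_ghz(11.14, 1.2)
p4     = frames_per_ghz(14.02, 1.5)

# Nearly identical per-clock throughput: the P4's absolute lead
# comes almost entirely from its higher clock rate.
print(round(athlon, 2), round(p4, 2))   # 9.28 9.35
```

This is exactly why the two posters can look at the same numbers and disagree: per clock the chips are a wash, but if the P4 design really does scale to much higher clocks, absolute performance is what ships.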
In other news (Score:4)
Meanwhile, AMD officials were quick to point out that the tests had already proven the "will of the program", which showed AMD ahead by a large margin. Intel was smug as they responded: "We will wait for the final benchmark, which will show that we are the faster x86."
Transmeta was unavailable for comment.