Apple's Mac OS X 10.5.5 "Leopard" held strong performance leads over Canonical's Ubuntu 8.10 "Intrepid Ibex" in OpenGL performance with the integrated Intel graphics, in disk benchmarking, and in the SQLite database tests in particular. Ubuntu, on the other hand, led in the compilation and BYTE Unix Benchmark tests. In the audio/video encoding and PHP XML tests the margins were smaller and no definitive leader emerged. In the Java environment, Sunflow and Bork were faster under Mac OS X, but Intrepid Ibex beat Leopard in SciMark 2. All of these results, though, came from a single Apple Mac mini.
Also worth mentioning is the collection of posts from the last thread that convincingly argued various problems with the Phoronix benchmarks: Example 1, Example 2, Example 3.
Speed tests are good; let's make sure we're doing them right.
Every one of those examples fails at reasoning about weaknesses in the Phoronix Test Suite, and here is why:
If you look closely you'll notice that (a) the benchmarks were run on a ThinkPad T60 laptop, and (b) there were significant differences on some benchmarks, like RAM bandwidth, that should have little or no OS component.
If you look closely you'll notice that (a) the laptop the benchmarks are run on in no way affects the validity of the benchmark, as long as they are run consistently on the same laptop, and (b) some benchmarks, like RAM bandwidth, have theoretical limits that are not affected at all by the operating system, but in actual practice throughput is entirely limited by the operating system you are using.
Some of the benchmarks were hardware testing, and those showed variation. They should not, unless the compiler changed the algorithms used to compile the code between distros.
All of the benchmarks were testing the hardware, and they should have shown variation. The compilers used to build the benchmarking applications are all the same, but the compilers used to build the operating systems are completely different versions. Therefore the compiler on each distro will compile the same "algorithm" in a slightly different way. And that is assuming there were no changes between the implementations of packages across distros (of which there were actually hundreds of thousands of changes in the code itself, the build options, and the runtime configurations).
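The point above can be sketched with a rough analogy. The Phoronix case involved different GCC versions and build options, but CPython makes the same idea easy to demonstrate: the identical source, compiled with different optimization settings, yields a different compiled artifact (the `check` function and its docstring here are purely illustrative).

```python
# Same source, different "build options": CPython's optimize=2 strips
# asserts and docstrings, so the compiled code objects differ, just as
# different GCC versions/flags emit different machine code.
source = '''
def check(x):
    """docstring"""
    assert x >= 0
    return x * 2
'''

ns0, ns2 = {}, {}
exec(compile(source, "<demo>", "exec", optimize=0), ns0)
exec(compile(source, "<demo>", "exec", optimize=2), ns2)

print(ns0["check"].__doc__)   # docstring survives at optimize=0
print(ns2["check"].__doc__)   # stripped (None) at optimize=2
print(ns0["check"].__code__.co_code == ns2["check"].__code__.co_code)  # False: assert removed
```

The behavior of `check(5)` is the same either way; only the generated code differs, which is exactly why "the same algorithm" can still benchmark differently across distros.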
The test suite itself: The Phoronix test suite runs on PHP. That in itself is a problem-- the slowdowns measured could most likely be *because* of differences in the distributed PHP runtimes.
The Phoronix Test Suite only uses its PHP back-end to aggregate benchmarking information. If a compilation with GCC took 5 seconds, it's going to take 5 seconds no matter what version of the PHP runtime is used to start the sub-shell that GCC runs in. It would take the same amount of time if you invoked GCC from bash, Perl, Python, Java, Tcl, C, or C++. It doesn't matter, because GCC is its own process, just like every other benchmark.
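A minimal sketch of that claim (using Python as the launcher and `sleep` as a stand-in benchmark; both are assumptions, not the suite's actual code): the wall-clock time of a child process is dominated by the child itself, not by the language that spawned it.

```python
import subprocess
import time

def time_child(cmd):
    """Wall-clock time of a child process, as any launching language would see it."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

# The child "benchmark" sleeps for 0.2 s; the measurement reflects the child,
# plus only a tiny spawn overhead from the launcher (Python, PHP, bash, ...).
# Assumes a Unix-like system with a `sleep` binary on PATH.
elapsed = time_child(["sleep", "0.2"])
print(f"child took ~{elapsed:.2f}s")
```

Swapping this launcher for PHP's `shell_exec()` or a bash script changes the few milliseconds of spawn overhead, not the 0.2 s the child takes, which is why the suite's PHP back-end doesn't skew the benchmark numbers.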
What exactly are they testing? The whole distro?
Yes.
The kernel?
Yes again, since that is part of the distro.
If they're testing the released kernel, then they should run static binaries that *test* the above, comparing kernel differences.
No, that wouldn't prove anything, as most of the binaries with each distro are already static.
Honestly, I was unimpressed by the benchmarks. I happen to do performance benchmarking as part of my job, and I can tell you, you have to eliminate all the variables first -- isolate things to be able to say "X is slow". If you rely on a PHP runtime, use *exactly* the same PHP runtime for all your testing; otherwise, you'll get misleading results.
Wrong. You isolate it down to one independent variable; it's called the scientific method. And there was only one independent variable involved: the distro. Everything else is dependent on that variable.
Avoid strange women and temporary variables.