Slashdot
C Faces Java In Performance Tests
Posted by timothy on Sat Jun 03, 2000 07:12 AM
from the who'dathunk-it? dept.
catseye_95051 writes "An independent programmer in the UK (he does not work for any of the Java VM makers) has done a series of direct C vs. Java speed comparisons. The results are likely to surprise some of the Java skeptics out there." Author Chris Rijk admits, "This article is not supposed to be an attempt to fully quantify the speed difference between Java and C - it's too simple and incomplete for that," but the results are nonetheless food for thought.
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Is the max-C really the best set of optimizations? (Score:4)
Likewise, he doesn't use the -fprofile-arcs or -fbranch-probabilities options which would probably speed up some of the code quite a bit, I would imagine.
This is a great discussion (Score:3)
If the next generation of programmers are as inflexible and intolerant as the
Re:Optimal FFT was not the point (Score:3)
/* This one is not optimized and looks like the one used for the FFT */
double prod(double *x, double *y, int len)
{
    double sum=0;
    int i;
    for (i=0;i<len;i++)
        sum += x[i]*y[i];
    return sum;
}
/* This function is an optimized version of the previous one */
double prod(double *x, double *y, int len)
{
    double sum1=0, sum2=0, sum3=0, sum4=0;
    double *end = x + len;
    /* main loop: four independent partial sums per iteration */
    while (x < end-3)
    {
        sum1 += *x++ * *y++;
        sum2 += *x++ * *y++;
        sum3 += *x++ * *y++;
        sum4 += *x++ * *y++;
    }
    /* pick up the remaining 0 to 3 elements */
    while (x < end)
        sum1 += *x++ * *y++;
    return sum1+sum2+sum3+sum4;
}
I recently used this optimization in my own code and got a speedup of about a factor of 3 (on an Athlon 500). The first version of the function has three problems:
1) The loop overhead is very expensive compared to the 2 float operations inside it.
2) Indexing is often more expensive than a simple pointer increment (though not always).
3) This one is the most important. In the first example, each sum (+=) needs the result of the previous sum before it can start, so the FP ADD pipeline stalls. Each addition then costs not one cycle but the full length of the pipeline. In the second example, the multiple partial sums prevent that.
Also, as I said before, a good FFT coded in C can be as fast as the processor allows. Java code can be just as fast if it is good, but not twice as fast, as in the benchmarks. Because the FFT code wasn't optimized, the performance difference likely came from the loop overhead.
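For anyone who wants to check the factor-of-3 claim on their own machine, here is a rough sketch of a timing harness. It is not the poster's code: the names dot_naive, dot_unrolled and time_dot are made up for illustration, the unrolled version uses indexing rather than pointer increments, and clock() only gives coarse CPU time. A modern optimizing compiler may narrow or erase the gap, so treat any number it prints as specific to your compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Single accumulator: every add depends on the previous one. */
double dot_naive(const double *x, const double *y, size_t len)
{
    double sum = 0.0;
    for (size_t i = 0; i < len; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Four independent accumulators, unrolled by four (hypothetical variant). */
double dot_unrolled(const double *x, const double *y, size_t len)
{
    double s1 = 0.0, s2 = 0.0, s3 = 0.0, s4 = 0.0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        s1 += x[i]     * y[i];
        s2 += x[i + 1] * y[i + 1];
        s3 += x[i + 2] * y[i + 2];
        s4 += x[i + 3] * y[i + 3];
    }
    for (; i < len; i++)          /* remaining 0 to 3 elements */
        s1 += x[i] * y[i];
    return s1 + s2 + s3 + s4;
}

typedef double (*dot_fn)(const double *, const double *, size_t);

/* CPU seconds spent in `reps` calls of `fn` on the same input. */
double time_dot(dot_fn fn, const double *x, const double *y,
                size_t len, int reps)
{
    assert(reps > 0);
    volatile double sink = 0.0;   /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        sink += fn(x, y, len);
    clock_t stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Call time_dot(dot_naive, ...) and time_dot(dot_unrolled, ...) on the same arrays and compare. Note that the two functions add in a different order, so on general floating-point input their results can differ in the last bits.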
Give me a Java COMPILER! (Score:3)
Java as a language is fine. But Java on a VM just doesn't cut it for real-world apps.
I am currently developing a product series of WAP servers and gateways. A few competitors have chosen Java, and they can support a maximum of 500 simultaneous users, and that with at least 256 MB RAM and 2 to 4 Pentium III 600 MHz+ processors. The C versions of similar products have no problem supporting several thousand users on a single-processor 64 MB machine. Java just ain't in the ballpark. (I should also point out that the Java VMs are not very stable and crash frequently.)
Java's specs as a language are really nice. Why don't we leave this VM stuff to the specialty apps that need it and start using Java as a COMPILED language?
Java excecution speed actually good (Score:4)
Most Java VMs are quite good at executing Java code, so the results are not all that surprising.
Java's biggest problem is in memory requirements. Metadata for classes is frequently much larger in size than both bytecodes and allocated objects. This needs to improve if Java is to become a more mainstream language.
Benchmarks misleading - Java vs C (Score:4)
1. Medium to big Java apps need 128-256 MB of system RAM to be usable. HotSpot increases the memory footprint (it uses memory for the compiler, and both bytecode and native code are kept in memory), but does not enhance every type of app. HotSpot looks great on many benchmarks (small loop-intensive apps tested on systems with lots of memory), but for many apps it slows things down.
2. By pre-running the code for 1 second to get how many iterations to use for 10 seconds, you're making sure that HotSpot and JITs fully kick in, without counting any of their execution overhead.
3. Contrary to what you might expect, there's no UI in the game of Life benchmark.
4. The benchmarks are set up to favour run-time optimizations by having function parameters that are constant for long periods of time (i.e. matrix size).
Java is just fine when you have tons of memory, but if your users have 64 MB or less, go with VB or C++.
The benchmarks in the article have completely avoided any JVM/hotspot initialization overhead, as well as sticking to things JIT compilers are good at.
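To make the objection in point 2 concrete, here is a minimal sketch of that calibrate-then-measure scheme as I understand it from the description above. It is not the article's harness: count_iterations, scale_iterations, timed_run and sample_work are hypothetical names, and clock() stands in for whatever timer was really used. The point is that everything in the probe phase goes untimed, and in a JVM that is exactly when the JIT would be compiling.

```c
#include <assert.h>
#include <time.h>

typedef void (*work_fn)(void);

/* Hypothetical workload so the sketch has something to run. */
static volatile unsigned long sample_sink;
void sample_work(void)
{
    for (unsigned long i = 0; i < 1000; i++)
        sample_sink += i;
}

/* Probe phase: call `work` for about `probe_seconds` of CPU time and
   count the calls. Nothing done here shows up in the final timing. */
long count_iterations(work_fn work, double probe_seconds)
{
    assert(probe_seconds > 0.0);
    long n = 0;
    clock_t start = clock();
    do {
        work();
        n++;
    } while ((double)(clock() - start) / CLOCKS_PER_SEC < probe_seconds);
    return n;
}

/* Scale the probe count up to the count used for the measured run. */
long scale_iterations(long probe_count, double probe_seconds,
                      double target_seconds)
{
    return (long)((double)probe_count * (target_seconds / probe_seconds));
}

/* Measured phase: only this part is reported. */
double timed_run(work_fn work, long n)
{
    assert(n > 0);
    clock_t start = clock();
    for (long i = 0; i < n; i++)
        work();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A fairer variant would also report the probe phase, or time the first call separately, so that startup and compilation cost is visible instead of hidden.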
Java chip (Score:3)
Re:Java excecution speed actually good (Score:3)
* C was given the choice of two not-very-good compilers: GCC and MSVC. From experience, I have seen the same code (especially math or array-intensive code) execute an order of magnitude faster when compiled with Kai CC or Portland Group CC. OTOH, Java was using the top of the line compilers and JVM (e.g. MS's JVM is well known to be much faster than even Sun's in Solaris...)
* Java had the advantage of run-time optimization. If you go to Ars Technica and read up on HP's Dynamo, you'll see how run-time optimization *alone* can give you a 15-20% improvement in speed with *compiled* binaries. Granted, run-time optimization is 'in the box' for the Java platform while, besides Dynamo, C/C++ are stuck without it.
Even if you dismiss the run-time optimization advantage as an integral part of the test, the choice of compilers *did* have a speed effect...
At any rate, I *am* a Java fan --I am just curious to see some true, fair benchmarks.
engineers never lie; we just approximate the truth.
I don't like Java (Score:4)
First, I wouldn't say it has everything one could want in an OOP language. The language feels like watered-down C++: templates (and STL), objects on the stack, const, references, and true multiple inheritance are all missing from Java, but clearly would be useful. Yes, the absence of these features makes life easier for beginners, but it's painful to work around Java's deficiencies when you know how to use such features.
Unfortunately, Java isn't really multiplatform, either, unlike what Sun's marketing team would have you believe. Java is multiplatform in the same way that my Super Nintendo ROM is: I can play it on windows, linux, solaris, etc. I need an emulator, of course. Similarly, I need a "Java Virtual Machine" to run my Java bytecode: it's really just an emulator for a platform that doesn't exist. And if the emulator isn't ported to your favorite platform, well, tough.
But the main thing I don't like about Java is how gratuitously integrated it is. Why should the Java standard library (which is really a platform in itself) be inextricably bound with the Java language? It could easily have been made into a C++ library, since C++ has direct support for all the language features of Java. Then, they could have written the Java language/bytecode interpreter separately, and made it an option to use the Java platform. This would clearly be better for everyone (except Sun): I could use the well-designed Java APIs in my C++ project with no loss of speed.
The same thing goes for much of Java. Why does Javadoc, a program that generates documentation from comments in your code, have to be integrated with the rest of Java? It could also be used to document C/C++ code, with minor modifications.
IOW, Sun is trying to lock you into their platform in the same way that MS is with their strange APIs, except that Sun's method is much more effective. I am sticking with C++.
My Criticisms of Java... (Score:3)
...never focused on the fact that it is slow. It was always the fact that they had to go and invent another language that isn't any better than C or C++. I'm all in favor of cross-platform development, but forcing us all to maintain yet another codebase, in yet another language is just a royal pain.
If Sun had produced a platform-independent C/C++ environment... wow... just imagine. We may never know how good it could have been.
Kernel times (Score:4)
The results of these numeric tests surprised me, but I'd like to have seen Watcom/Borland C compilers used, as both have a reputation for superior numeric code generation to Microsoft's Visual C++ product and GCC.
What I think... (Score:5)
One of the easiest languages to learn (provided that you understand OOP). I tried C++, and I failed (for now). I tried Java, and it is very easy for me. And for the ease of learning, it gives me immense power. Everything anyone could ever want in a true OOP language.
It is also multiplatform... we all know about that.
The only language I can think of that comes close is VB. VB is Windows-only, (well, you have VBA in Office98 on MacOS, OK) and it doesn't give you much of OOP (inheritance, etc.).
Finally, there are a lot of people out there that will learn a language simply because it's in demand, so that they can get a lot of money paid for writing things in it, and Java wins here as well. Just go do a search on Dice.
The only thing that bothers me is that Java is now definitely being controlled by a corporation. I'm pretty glad it's not Microsoft, but I'd still rather have it controlled by an unbiased group. OTOH, without Sun's promotion and development, who knows if Java would ever rise to where it is today.
Let's just hope that the damn applets will fade out... I just hate them! Please correct me anywhere you think I'm wrong - that's what the Reply link is for.
--
Speed and server applications (Score:3)
Volano benchmark (Score:4)
You will see there that the best VM is Tower TowerJ 3.1.4 for Linux!
Second point: I have never doubted that Java on the server is a good solution now. For me, the only trouble with Java now is the memory gluttony.
If some of you want to test JSP/Servlet, here are some good open source products: java.apache.org (JSP, servlet), www.enhydra.org (JSP, servlet, EJB)
Re:Kernel times (Score:3)
I remember writing a small test program in C and identically in Java (a couple of syntax changes only). Using IBM's JDK with -O made it faster than pgcc with all the optimization flags we could think of!
Then I rewrote the program in an OO way in Java, and of course it was slow :)... but it does show that Java isn't necessarily slower than C for some tasks.
Re:Speed and server applications (Score:3)
Memory allocation (Score:4)
It's a bit unfair to blame gcc for poor memory allocation: unlike in Java, memory allocation isn't built into the C language.
Re:Too good to be true (Score:5)
As far as getting the latest JDK in anything but Windoze, you can currently get Java2 v1.3 in Windoze, Solaris and Linux (with other ports on the way). The fact that they came out with the Windoze port first should be no real surprise to anyone: most folks are still using Windoze, hence there is more demand for upgrades on this OS than any other.
I've written Java stand-alone apps that are monumental in size and I've written Java server-based apps. I think that Java's main glory lies in server-side programming for web-enabled applications, but it is no slouch in the large stand-alone application market. You keep hearing people complain that Java eats up so much memory when all you want is a simple Notepad app. You need to understand what Java is doing and learn to work with/around it.
If you load a large app that utilizes many of the Swing widgets and interfaces, the memory load becomes a bit more understandable. On the large apps that I've written for Java, it has actually performed quite well (sub-second sorting and display on a 10K row table, etc).
Most of the comments that I see bashing Java are from people that have only taken a cursory pass at the language. If you try to code a Swing interface using the same paradigms as AWT (or C, C++, etc), you'll wind up with a slow monstrosity. If you code Swing the way it was intended to be coded (using the MVC architecture), you'll find that it's a remarkably powerful and full-featured GUI API.
At any rate...I'll get off my soapbox now. I really don't mean to tout Java as the be-all end-all of programming languages (it's not). But it is one of the better languages out there for the current direction of Internet-enabled programming.
Re:I don't like Java (Score:3)
The final keyword in Java also serves the purpose of constant declarator for variables. This
"In java, in addition to Object, you need a classloader to load your code. And you get no strings, because the compiler has specific syntactic sugar for dealing with strings (why oh why did they hack this into the compiler instead of just supporting operator overloading?) I'm sure there are more, but it's been a while since I hacked java at that low a level."
If you are griping that VMs are too dependent on the Java language itself...well, tough. Sun created the VM spec FOR the Java language...not as some universal VM and byte code instruction set that anybody could write any arbitrary language on top of (although that is certainly possible). People write Java in Java...they don't write in bytecode, against the VM directly.
Also, somebody was griping that the standard libraries and default utilities from the Sun JDK were written in Java. Well...DUH. They are Java libraries and tools. They should be written in Java. Java was created for easy cross-platform development...it would be stupid then to write all the libraries and tools in some native language, then have to ALSO port all those on top of writing a VM. With the libraries and tools written in Java itself, all VM writers have to worry about is writing a VM, and presto, the support libraries and tools are all magically available. It would be hypocritical and tragically stupid to write the supporting tools and libraries for platform-neutral Java in some platform-dependent language.
Dynamic Optimization (Score:3)
Nevertheless I think dynamic optimizations are the thing to come: it costs a lot of man-hours to find ideal optimizations for code (you need to figure out the core routines, think about which optimizations make most sense for the current architecture, and check those assumptions against reality), and man-hours, in contrast to CPU time, don't become much cheaper. The dynamic optimizer does all that work for you, and even optimizes for different starting conditions/parameters by looking at what is *really* taking time now.
Look at the success (regarding computing power per buck) of Transmeta's Crusoe. A dynamic optimizer can gather far more hints for optimizations (branch predictions, loop length, array sizes, memory lookups) than a static one. In the latter case the programmer has to give all the hints (compile a subroutine with the correct set of optimizations, sort the loops right, sort branches, keep in mind some ranges for parameters and how they affect loop length, for some compilers throw in compiler directives, etc.) and even has to reconsider when porting to another architecture.
So with static optimizations it's either optimization limited to the part the compiler can see at compile time (except for very basic stuff; every decent compiler will get that matrix multiplication right) or man-hour-intensive and thus costly optimization.
Re:Kernel times (Score:3)
One area in which C does not offer significant benefits over Java is in the area of network server programming
I agree, but with one exception: Java does not do non-blocking I/O. Therefore you have to use tons of threads, at least one, likely two, for each connection. For a server handling thousands of connections, you can see where this gets out of hand. In Linux's case where all threads are kernel level threads, performance is back in the shitter since it has to make a new set of system calls to manage the threads. But of course, you want to use native threads so that you can take advantage of multiple processors.
If non-blocking I/O were possible, one thread and a huge select is all that is needed. Squid is a good example of a server that can handle thousands of connections using one thread. The cost here is complexity, but the reward is performance.
I'm not advocating non-blocking I/O. I think Java's approach makes for much simpler and more stable servers, but JVMs must make threading as lightweight as possible while still supporting SMP for performance to compete with C-based servers. I think this means supporting a mix of kernel and user level threads.
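For readers who haven't seen the "one thread and a huge select" style, here is a minimal sketch of its core in C on a POSIX system. wait_readable is a name I made up, and a real server like Squid does far more (non-blocking sockets, write readiness, timers); this only shows one thread asking the kernel which of many descriptors have data waiting.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait until at least one descriptor in fds[] is readable, or until
   timeout_ms has passed. Stores the indices of the readable descriptors
   in ready[] and returns how many there are (0 on timeout, -1 on error). */
int wait_readable(const int *fds, int count, int timeout_ms, int *ready)
{
    fd_set readset;
    struct timeval tv;
    int maxfd = -1;

    assert(count > 0);
    FD_ZERO(&readset);
    for (int i = 0; i < count; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* One system call covers every connection at once. */
    int n = select(maxfd + 1, &readset, NULL, NULL, &tv);
    if (n <= 0)
        return n;

    int found = 0;
    for (int i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readset))
            ready[found++] = i;
    return found;
}
```

A server loop would call this repeatedly and do a read() on each descriptor reported ready, with no thread per connection. The cost mentioned above is visible even here: the program has to track the state of every connection itself instead of leaving it on a thread's stack.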
It's all about optimization (Score:5)
What makes me even more suspicious is that I have a K7-500 too, and I have done some tests with a heavily optimized FFT (fftw) and I get a performance of around 400 mflops. There's just no way a JVM can be 220% faster than that. So my conclusion is "with poor code and poor optimization, Java can be faster than C".
I don't want to take a position on the whole Java vs. C speed debate, but what I'm saying is that at least his FFT test is flawed.
Re:Java excecution speed actually good (Score:3)
No, a Java VM doesn't have any defined time to collect. Different JVMs do it at different times.
Most have a low-priority thread that will GC either whenever it is runnable (for GCs that are interruptible), or when it has waited "long enough", or when "enough" memory has been allocated. All JVMs I know of also GC when they are out of memory, except for the Java subsets that don't GC at all (like SmartCard Java).
The one time Java doesn't GC as a rule is "when you free something", because it doesn't know when you free anything. There is no free operator. You can assign null to a reference, but that won't free the object unless that was the last reference to it! Running a GC on every null (or other!) reference assignment would be staggeringly expensive. Java could keep a reference count hint with each object (like Limbo does), but that has its own problems (and advantages). I don't know of any JVM that does.
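To illustrate the reference-count idea in the last paragraph, here is a minimal sketch in C. It is not how any JVM or Limbo actually implements it; Object, object_retain, object_release and object_live_count are hypothetical names. It shows why dropping one pointer frees nothing until the last one goes.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int refcount;   /* number of live pointers to this object */
    int value;
} Object;

static int live_objects = 0;   /* bookkeeping for the demo only */

/* Create an object that starts with one reference. */
Object *object_new(int value)
{
    Object *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcount = 1;
    o->value = value;
    live_objects++;
    return o;
}

/* Record one more pointer to the object. */
Object *object_retain(Object *o)
{
    assert(o != NULL && o->refcount > 0);
    o->refcount++;
    return o;
}

/* Drop one pointer; the object is freed only when the last one goes.
   Always returns NULL so callers can write p = object_release(p). */
Object *object_release(Object *o)
{
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0) {
        free(o);
        live_objects--;
    }
    return NULL;
}

int object_live_count(void)
{
    return live_objects;
}
```

The "own problems" include the cost of touching the count on every pointer copy, and the fact that two objects pointing at each other never reach zero, so plain counting leaks cycles unless something else cleans them up.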
Re:It's all about optimization (Score:3)
A good example of this is Perl programmers moving to Python. If you do things in Python the "Perl way", typically, it performs rather poorly. Likewise, if you do it the Python way, it performs on par with Perl.
This highlights the fact that no optimizer can replace a good programmer with good design skills. It may, however, suggest that you can get reasonable results with less skilled programmers using Java than you might get with C, or especially C++.
Re:Java excecution speed actually good (Score:3)
Generational garbage collectors try to run a GC sweep after every X KB of allocations (where X is about the size of the cache). They are quite a bit faster than most GCs that only collect when memory is critically low (memory accesses in cache are an order of magnitude faster than out-of-cache references). The downside is that they GC more frequently, so rather than being "fast" for five minutes and then taking a short pause, they nick you for a few 100 ms all the damn time. Of course that is also an upside: they don't feel jerky.
Generational GC also tends to pack items allocated at the same time (and still live) together, which for many programs increases the locality of reference, and helps a whole hell of a lot if the system is paging.
I don't know how many JVMs use generational GC, but since it is a 70s Lisp technology, I can't imagine they use something worse. GC hasn't been a red hot research area, but it has had a lot of good work done in the last 20 years (and a lot more before that!)
I do know many JVMs run GC sweeps periodically if there is idle CPU (get a Java profiler, and check out the activity in the GC thread).
What the original post was complaining about was the "overhead" with each object. I'm not convinced that exists. I know every object has the equivalent of a C++ vptr (four bytes). Every object type has a virtual function table (possibly shared if it doesn't define new functions, or override any of the parent's), and a small description of the data fields, and the name of the type, and the function names and such.
That's a lot of crap -- say 400 bytes. Easily more than a simple structure (say a 2D point, two 4-byte ints). If you have one point object, you probably have 100 times as much memory dedicated to describing the point object (in case you need to serialize it and send it via an RPC or something). But if you have 5000 points, the overhead of the metadata is vanishingly low (400 bytes out of 40,400 bytes, 1%). Er, except for the vptrs (4 bytes on most systems); that'll bring it up to 20,400 overhead bytes out of 60,400 total bytes, or about 34%.
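The arithmetic above is easy to redo for other object counts and sizes. A small sketch, with overhead_fraction being a name I made up, and the 400-byte and 4-byte figures being the guesses from this post rather than measurements of any real JVM:

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of total memory that is overhead when `count` objects of
   `payload_bytes` each carry `per_object_overhead` bytes apiece, plus
   one shared block of `per_class_overhead` bytes of class metadata. */
double overhead_fraction(size_t count, size_t payload_bytes,
                         size_t per_object_overhead,
                         size_t per_class_overhead)
{
    assert(count > 0);
    double overhead = (double)per_class_overhead
                    + (double)count * (double)per_object_overhead;
    double total = overhead + (double)count * (double)payload_bytes;
    return overhead / total;
}
```

With the numbers from the post, overhead_fraction(5000, 8, 0, 400) is 400/40,400, roughly 1%, and overhead_fraction(5000, 8, 4, 400) is 20,400/60,400, roughly 34%. For a single point the same call with count 1 gives 404/412, so nearly everything is overhead.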
So for very simple objects Java does have a noticeable overhead. But for less simple objects the overhead is much smaller. If you compare any Java class with a C++ class that has virtual functions, the per-instance overhead is identical. The per-class overhead is different (with Java almost certainly having more overhead), but the per-class overhead isn't interesting. There are not enough classes in most programs to make a difference (and believe it or not, with templates it's far easier to "accidentally" make 1000s of different classes in C++ than in Java).
That leaves arrays. C/C++ arrays need not have a length stored with them while Java ones do. Java is behind 4 bytes per array on that score. Relevant if you have lots of small arrays, irrelevant otherwise. Except....
You know C++ does need to know how many elements are in an array so it can call the destructor for each one (it can omit this, if there is no destructor, but I don't know of compilers that do that). So it doesn't beat Java, it ties.
...Oh, and C needs the length for dynamically allocated arrays (via malloc) so it can free them again. But it does win on static arrays.
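For comparison, this is roughly what an array that carries its own length looks like when written out by hand in C. It is only a sketch: IntArray and its functions are hypothetical, and on a 64-bit system the size_t header is 8 bytes rather than the 4 assumed above.

```c
#include <assert.h>
#include <stdlib.h>

/* An int array with its length stored in front of the elements. */
typedef struct {
    size_t length;
    int data[];          /* C99 flexible array member */
} IntArray;

/* Allocate a zero-filled array of `length` ints; NULL on failure. */
IntArray *int_array_new(size_t length)
{
    IntArray *a = calloc(1, sizeof *a + length * sizeof a->data[0]);
    if (a != NULL)
        a->length = length;
    return a;
}

/* Bounds-checked read, possible only because the length is stored. */
int int_array_get(const IntArray *a, size_t i)
{
    assert(a != NULL && i < a->length);
    return a->data[i];
}

/* Bounds-checked write. */
void int_array_set(IntArray *a, size_t i, int value)
{
    assert(a != NULL && i < a->length);
    a->data[i] = value;
}
```

Release it with a plain free(a). The allocator keeps its own record of the block size behind the scenes, which is the hidden per-array cost the post is pointing at for malloc'd arrays.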
Pretty much all of them if you make a thread that calls java.lang.Runtime.gc() and then sleeps for a few seconds in a loop. Or even most of them (I think) if you merely have some idle CPU.
Re:Kernel times (Score:3)
In my opinion making a program complex for performance reasons only is a bad idea unless we're talking about long term usage of a program. Developers often forget that most of the cost of developing a product goes into maintenance. Java servlets provide you with a nice scalable architecture for serverside programming and it allows you to focus on the parts of the program that provide the functionality you need rather than performance related stuff.
Message from the author (Score:4)
Another post I sent in last night which quickly got rejected was this:
Unfortunately, that release came a little too late for me to do much about, though I have quickly tested the Solaris x86 version (on the same hardware as the Windows tests), and the results are pretty much identical, though Solaris was a bit faster (but then, I was running without the desktop, which does help).
Also coming a bit too late were results from IBM's Windows 1.2.2 JDK, which I found a bit surprising - it did worse on some tests, and better on others, though I didn't have much time to test things.
Thanks for the replies... kinda makes it all worth it - it took me about 100 hours over 4 weeks to do this. (took up a lot of my evenings)
I better re-install Linux sometime so I can test on it again... (my last install stopped working for unknown reasons)
It'll probably be some time before I update the article - first I want to finish off my MAJC article, which really is too damn big. (22,000 words... ouch).
Re:Memory allocation (Score:3)
C++'s new tends to be a thin layer over malloc. The STL allocator wasn't designed to be faster than new/malloc, but to deal with segment issues, shared memory issues, and maybe even object permanence.
The allocator in SGI's STL (which gcc currently ships) allocates about 20 objects when you ask it to allocate, and doles them out one by one. That's for "small" objects. Anything over about half a K (or maybe 2K? I forget) goes through to new. This might be faster than 20 mallocs. Or not. Some mallocs are pretty good, better than the STL allocator overhead. It does tend to reduce fragmentation, by a whole lot more than I expected.
If you don't like the default allocator, allocators are easy to write, and Alloc_malloc is always ready to step in. There is even a #define to ask for it to be used everywhere.
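To show the "grab about 20 at once and dole them out" idea without the template machinery, here is a minimal free-list pool in plain C. It is a sketch of the general technique, not SGI's code: Pool, pool_alloc and pool_free are hypothetical names, the 32-byte slot size is an arbitrary stand-in for a "small" object, and like many pools it never hands chunks back to the system.

```c
#include <assert.h>
#include <stdlib.h>

#define OBJECTS_PER_CHUNK 20

/* A slot is either on the free list (next) or in use (payload). */
typedef union Slot {
    union Slot *next;
    char payload[32];    /* assumed maximum "small object" size */
} Slot;

typedef struct {
    Slot *free_list;
    int chunks_allocated;   /* how many times malloc was really called */
} Pool;

void pool_init(Pool *p)
{
    p->free_list = NULL;
    p->chunks_allocated = 0;
}

/* Hand out one slot, refilling from malloc 20 slots at a time. */
void *pool_alloc(Pool *p)
{
    if (p->free_list == NULL) {
        Slot *chunk = malloc(OBJECTS_PER_CHUNK * sizeof *chunk);
        if (chunk == NULL)
            return NULL;
        for (int i = 0; i < OBJECTS_PER_CHUNK; i++) {
            chunk[i].next = p->free_list;
            p->free_list = &chunk[i];
        }
        p->chunks_allocated++;
    }
    Slot *s = p->free_list;
    p->free_list = s->next;
    return s;
}

/* Return a slot to the pool; it becomes the next one handed out. */
void pool_free(Pool *p, void *ptr)
{
    assert(ptr != NULL);
    Slot *s = ptr;
    s->next = p->free_list;
    p->free_list = s;
}
```

Twenty allocations cost one malloc call, and slots of the same size sit next to each other, which is one plausible reason for the reduced fragmentation mentioned above.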