User Journal

Journal: Java and C++ optimisation, and GCJ

I've been trying to compile gcj on my Mac. I've had to use my home Linux box at work recently, so the Mac is now my home 'PC', and actually I kind of like it :-) Anyway, while waiting for gcj to compile, I went browsing for articles on it and found 2 of interest:

  • An IBM article on how gcj didn't really compare to the IBM VM...
  • A slashdot discussion on the above IBM article

Now what surprised me was just how badly gcj was doing on the benchmarks he'd written - even *if* (and I make no accusations, just note that IBM's VM won...) it was a PR piece dressed up as an article. I decided to check out the performance on a linux box I could ssh to...

Here's the java code: (slightly edited to look better in slashcode)

import java.io.*;
class prime
  {
  private static boolean isPrime(int i)
      {
      for(long test = 2; test < i; test++)
        if (i % test == 0)
            return false;
      return true;
      }
 
  public static void main(String[] args) throws IOException
      {
      long start = System.currentTimeMillis();
      long n_loops = 50000;
      long n_primes = 0;
 
      for(int i = 0; i < n_loops; i++)
        if(isPrime(i))
            n_primes++;
 
      long end = System.currentTimeMillis();
      System.out.println(n_primes + " primes found");
      System.out.println("Time taken = " + (end - start));
      }
  }

First off, this is a truly awful algorithm for finding primes, but it's the code he provided... In any event it certainly tests loops a lot [grin]. The author didn't provide a comparable C/C++ program so here's one I prepared earlier:

#include <stdio.h>
#include <sys/time.h>
 
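/* Some platforms already provide timersub() in <sys/time.h>;
   only define our own if it is missing. */
#ifndef timersub
# define timersub(a, b, result) \
  do { \
    (result)->tv_sec = (a)->tv_sec - (b)->tv_sec; \
    (result)->tv_usec = (a)->tv_usec - (b)->tv_usec; \
    if ((result)->tv_usec < 0) { \
      --(result)->tv_sec; \
      (result)->tv_usec += 1000000; \
    } \
  } while (0)
#endif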
 
static int isPrime(long i)
    {
    for (long test=2; test<i; test++)
        if (i%test == 0)
            return false;
 
    return true;
    }
 
int main(int argc, char **argv)
    {
    struct timeval stt,end, dt;
 
    gettimeofday(&stt, NULL);
    long n_loops = 50000;
    long n_primes = 0;
 
    for (long i=0; i<n_loops; i++)
        if (isPrime(i))
            n_primes ++;
 
    gettimeofday(&end, NULL);
    timersub(&end, &stt, &dt);
    /* tv_sec, tv_usec and n_primes are not plain ints, so cast and print as long */
    printf("Time taken: %ld.%06ld secs\n", (long)dt.tv_sec, (long)dt.tv_usec);
    printf("Primes : %ld\n", n_primes);
    }

... which is pretty much as direct a copy of the Java version as I can make. The programs were both compiled using -O3 and run, viz:

[simon@cyclops /tmp]% gcj --main=prime -O3 prime.java -o prime_j
[simon@cyclops /tmp]% ./prime_j
5135 primes found
Time taken = 15095
 
[simon@cyclops /tmp]% g++ -O3 prime.cc -o prime_cc
[simon@cyclops /tmp]% ./prime_cc
Time taken: 7.060192 secs
Primes : 5135

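Which would appear to indicate that the Java code runs at approximately 50% of the speed of the C++ code. BUT (you knew there was a 'but', right?) gcj is notoriously bad at optimising long integers. I suspect it actually does the top 32 bits, then the bottom 32 bits, then combines the results... It's also worth remembering that a Java 'long' is always 64 bits, whereas a C++ 'long' on a 32-bit Linux box is only 32 bits, so the C++ version may well not have been doing 64-bit maths in the first place. If we change all occurrences of 'long' to 'int' in the arithmetic (not the time variables), we get very different results: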

[simon@cyclops /tmp]% cp prime_int.java prime.java
[simon@cyclops /tmp]% gcj --main=prime -O3 prime.java -o prime_j
[simon@cyclops /tmp]% ./prime_j
5135 primes found
Time taken = 7061
 
[simon@cyclops /tmp]% cp prime_int.cc prime.cc
[simon@cyclops /tmp]% g++ -O3 prime.cc -o prime_cc
[simon@cyclops /tmp]% ./prime_cc
Time taken: 7.061838 secs
Primes : 5135
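
For anyone wanting to repeat the run, the int-only Java variant presumably looks something like the sketch below. prime_int.java itself isn't reproduced in this entry, so this is a reconstruction from the description (every 'long' in the arithmetic swapped for 'int', the timing variables left alone); the class name PrimeInt and the countPrimes helper are made up for illustration.

```java
// Reconstruction of the int-only benchmark. The real prime_int.java is not
// shown in the entry, so the class and method names here are assumptions.
class PrimeInt {
    // Same deliberately naive trial division as the original, but with an
    // int loop counter so all the arithmetic stays 32-bit.
    static boolean isPrime(int i) {
        for (int test = 2; test < i; test++) {
            if (i % test == 0) {
                return false;
            }
        }
        // 0 and 1 also fall through to here, exactly as in the original code.
        return true;
    }

    // Counts how many values in [0, nLoops) pass the isPrime test.
    static int countPrimes(int nLoops) {
        int nPrimes = 0;
        for (int i = 0; i < nLoops; i++) {
            if (isPrime(i)) {
                nPrimes++;
            }
        }
        return nPrimes;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis(); // timing variables stay long
        int nPrimes = countPrimes(50000);
        long end = System.currentTimeMillis();
        System.out.println(nPrimes + " primes found");
        System.out.println("Time taken = " + (end - start));
    }
}
```

Note that the count includes 0 and 1, since the loop body never runs for them. That is why both programs report 5135 rather than the 5133 genuine primes below 50000.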

So, when you use 'int' variables, gcj is pretty much as good as g++ for this benchmark. What does this prove? Not very much, apart from that you should always take published figures with a pinch of salt when someone has a vested interest, and that the 64-bit maths in IBM's VM is better than GCJ's...

It just irked me that an entire article could be based on something so simple. I've always been reasonably impressed with the speed of GCJ, but perhaps that's because I tend to use 'int's in my loop variables rather than 'long's. I can't quite rid myself of the suspicion that the IBM author was making cheap capital out of a small thing, as well...

I'm a great fan of Java (and compiled Java). I find it a lot easier to write programs in, and far and away easier to maintain. I've got a fighting chance of opening up a colleague's JBuilder project and understanding what they've done (even though my colleagues tend to regard comments as optional, [sigh]). In C++ I have to worry a lot more about memory allocation - mainly in terms of the policy for release of objects and their private/protected data. This can truly be a nightmare :-(

IMHO one of the real 'wins' of GCJ is how easy it is to extend it with native bindings. I ported the JSDL SDL bindings for Java to gcjsdl in a matter of days because CNI is *much* nicer to work with than JNI.

I think it's fair to say that my compiled language of choice is now Java, with C++ as needed to bind external libraries. I think that says it all...

Simon

Journal: Anti organised-religion rant 2

[Well this started off as part of a comment on that Ashcroft nutter, but on reflection I decided to remove it from the mainstream post; it wasn't really relevant.]

It's interesting how western religion (an artificial social-control hierarchy) _almost_ always teaches that sex is taboo, to be limited and used to manipulate social behaviour in ways that further that social-control, sorry religion.

These religions tend to promise wonderful results (once you're dead!) or terrible suffering (again, once you're dead!) if you do/don't do what that 'authority' says as well... No easy way to argue against that without dying for the cause, which is a bit extreme when you're agnostic or atheist :-)

Think about it: if some book claimed that some bloke in Israel had risen from the dead 200 years ago, would you believe it ? Oh yeah, he was born from a virgin (riiiight!) and can walk on water too, not to mention he has a built-in replicator, though it's stuck on prawn sandwiches. Yes ? No ? If yes, well, all the more power to your elbow my fine delusional friend. Get out of your beliefs everything you can. If no, you either ascribe too much validity to the fact that the bible is *old*, or you'll agree with me that 'religious' people are just nutters.

There's nothing wrong per se about being a nutter. A lot of successful people were completely nuts. The saying that there's a fine line between genius and insanity is quite a deep reflection on what being nuts is all about, IMHO, though I doubt it was meant that way when first coined.

What can be scary is when nutters try to change you, try to coerce you into their belief system (and I don't just mean religious beliefs here). When such a nutter holds high office, it gets serious - there's not much worse than a motivated nutter with power.

Overall, despite my aversion to it, I think religion does more good than harm. I think it gives people with little else to live for a reason to live: to be good, honest, kind, the standard virtues. Western religious values have also been the foundation for most of the human rights we now take for granted. Perhaps it's just a phase that a society has to go through - a bit like all the spots that teenage males get at puberty...

Religion is also a bit like heroin, or opium in a different century, I guess :-) Society wouldn't recover easily from the sudden removal of its dependence on religion as a crutch in life. Excising any cancer takes time; it needs to be gradual, and for religion the process is underway. I can live with that.

You're born. You live. You die. That's it. You should be perfectly happy with that. I am. Consider the alternative.

Simon.

Journal: Journals as links and hostip.info

So, my first /. journal entry. Momentous occasion, perhaps. Perhaps not. What prompted it was the large number of people visiting hostip.info from my (allegedly non-existent) journal page... So I thought I'd put something in here :-)

Hostip itself started off as a "wouldn't this be cool" idea, and a first version was born. The 'individual privacy' minded will have a field-day with this, but the inspiration actually came from watching 'Enemy of the State' on a 747 flight :-) I wanted to do (in a very limited way, of course) something similar using the web. As always in projects like this, it's the data that's the hard part of the equation, not the coding...

This first version allowed people to type in new cities, and it would auto-associate them with their IP address. This was (as I should have foreseen) a complete disaster. The number of Martians living here on Earth is truly amazing. We apparently even play host to a couple of Alpha Centaurites; to these fine beings I say 'Welcome to Earth' in "Will Smith" fashion. (Yes, I'm a fan...)

Once it was clear that if bad data was trivial to enter, it would indeed be entered, I raised the bar a little. Now you can only choose cities that already exist (and which I have latitude and longitude for), or email me with the details of a previously-unknown city, and I'll check it out before entering it into the DB. This has made the database more useful... Needless to say, cleansing all the bad data from the DB was a monumental task. It literally took weeks, and if I'd known at the start how long it would take, I'd not have started it!

It's still possible to lie to the machine of course (and I dare say lots do, on purpose, simply because it's their principle to do so). I have in my own way tried to get around that - the DB keeps a history of assignments for each /24 netblock (that's the smallest unit it tracks), and since you can only reassign your own IP address, as soon as 2 others on your netblock tell the truth about where you are, it will switch to the real location... It's certainly not foolproof (hell, I can think of a half-dozen ways around it!) but it raises the bar...
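
The netblock rule is easier to see in code. What follows is only a minimal sketch of the idea as described above, not hostip's actual schema or logic (which lives in a MySQL database and isn't shown here). The class NetblockVotes, its method names, and the tie-break rule (the city claimed first keeps the netblock until another city has strictly more votes) are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one vote per IP address, counted per /24 netblock.
class NetblockVotes {
    // netblock key (top 24 bits) -> (full IPv4 address -> city it last claimed)
    private final Map<Integer, Map<Integer, String>> claims = new LinkedHashMap<>();

    // Packs a dotted-quad string such as "192.0.2.10" into a 32-bit int.
    static int parseIp(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // Drops the last octet, leaving the /24 network part.
    static int netblockOf(int address) {
        return address >>> 8;
    }

    // An address can only set (or change) its own claim.
    void claim(String ip, String city) {
        int address = parseIp(ip);
        claims.computeIfAbsent(netblockOf(address), k -> new LinkedHashMap<>())
              .put(address, city);
    }

    // Returns the city with the most claims in the caller's netblock, or null
    // if nobody in that netblock has claimed anything. On a tie, the city
    // that was claimed first in the netblock wins.
    String locate(String ip) {
        Map<Integer, String> block = claims.get(netblockOf(parseIp(ip)));
        if (block == null) {
            return null;
        }
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (String city : block.values()) {
            votes.merge(city, 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

With this rule, a lone claim of "Mars" holds the netblock until two other addresses in the same /24 both say "London", at which point London has the majority and takes over.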

Up until this point, hostip was a purely text-based system. Next came the map data. I got in touch with the US National Geophysical Data Center in Boulder, and asked them for the highest-resolution data they had. That turned out to be 30 arc-second elevation data for the entire planet. Wow! So I spent some time writing tools to efficiently extract the correct data and colour it nicely/correctly for the small maps I needed - this took a week or so... Just loading the data into RAM took a lot of time (eventually I remembered mmap() and things went a *lot* smoother!).
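
To show why memory-mapping helps here, this is a rough Java sketch of pulling one sample out of a big raster file without reading the whole thing into RAM. It is not the tooling described above (which isn't shown, and which used mmap() directly). The file layout is an assumption: row-major, 16-bit signed, big-endian samples. The class ElevationGrid and its methods are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: read single samples from a large elevation raster.
class ElevationGrid {
    static final int BYTES_PER_SAMPLE = 2; // assumed 16-bit samples

    // Maps only the row that holds the sample, so the OS pages in just the
    // part of the file that is actually touched.
    static short readSample(Path file, int width, int row, int col) throws IOException {
        if (col < 0 || col >= width) {
            throw new IllegalArgumentException("column out of range: " + col);
        }
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long rowStart = (long) row * width * BYTES_PER_SAMPLE;
            long rowBytes = (long) width * BYTES_PER_SAMPLE;
            MappedByteBuffer buf =
                channel.map(FileChannel.MapMode.READ_ONLY, rowStart, rowBytes);
            buf.order(ByteOrder.BIG_ENDIAN); // assumed byte order
            return buf.getShort(col * BYTES_PER_SAMPLE);
        }
    }

    // Demo helper only: writes a small grid in the assumed layout.
    static void writeGrid(Path file, short[][] rows) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(rows.length * rows[0].length * BYTES_PER_SAMPLE);
        buf.order(ByteOrder.BIG_ENDIAN);
        for (short[] row : rows) {
            for (short sample : row) {
                buf.putShort(sample);
            }
        }
        Files.write(file, buf.array());
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("grid", ".bin");
        try {
            writeGrid(tmp, new short[][] {{0, 1, 2, 3}, {10, 11, 12, 13}, {20, 21, 22, 23}});
            System.out.println(readSample(tmp, 4, 2, 3)); // prints 23
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Mapping one row at a time is a deliberate choice in this sketch: a single Java mapping cannot exceed Integer.MAX_VALUE bytes, so a multi-gigabyte file has to be mapped in windows anyway.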

The dataset consists of a 43000x21500 image, at approximately 1km/pixel, taking ~2.6Gbytes to store. Even things like ppmtogif can't handle that much data :-( The current database size (from du -sk on the mysql db directory) is 623Mbytes. All this needed to be correlated before the applet started to look even vaguely reasonable. It still has lots of errors, mostly where I have the decimal point wrong in latitude or longitude figures :-( but it's useful now, and I tend to get told [grin] when something is wrong...
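
As a rough illustration of how a latitude/longitude pair lands on a pixel in a grid like this, here is a small sketch. It assumes a plain equirectangular layout with 30 arc-second cells (120 per degree), column 0 at 180 degrees west and row 0 at the north pole; a full global grid at that resolution is 43200x21600, which is what the rounded figures above are taken to mean. The class GridIndex is hypothetical and nothing here comes from the hostip code.

```java
// Hypothetical sketch: map a lat/long to a cell in a 30 arc-second global grid.
class GridIndex {
    static final int SAMPLES_PER_DEGREE = 120;            // 3600 arc-seconds / 30
    static final int WIDTH = 360 * SAMPLES_PER_DEGREE;    // 43200 columns
    static final int HEIGHT = 180 * SAMPLES_PER_DEGREE;   // 21600 rows

    // Column 0 is assumed to start at longitude -180; values are clamped so
    // that longitude +180 falls in the last column rather than off the edge.
    static int column(double longitude) {
        int col = (int) Math.floor((longitude + 180.0) * SAMPLES_PER_DEGREE);
        return Math.max(0, Math.min(WIDTH - 1, col));
    }

    // Row 0 is assumed to start at latitude +90 (north at the top).
    static int row(double latitude) {
        int r = (int) Math.floor((90.0 - latitude) * SAMPLES_PER_DEGREE);
        return Math.max(0, Math.min(HEIGHT - 1, r));
    }

    public static void main(String[] args) {
        // Somewhere near London, as an example input.
        System.out.println(column(-0.125) + "," + row(51.5));
    }
}
```

At the equator one 30 arc-second cell is a little under 1km across, which matches the "approximately 1km/pixel" figure. It also shows why a misplaced decimal point in a latitude or longitude puts a city on a completely different part of the map.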

One of the reasons I wanted to do this (apart from the obvious coolness of the idea :-) is to give something back to the people who've given me so much 'free' software over the years. Those from this nameless multitude, I salute you - I hope you get as much out of hostip as I got from your various projects/programs.

I happen to think the applet is (even though I wrote it myself, [grin]) one of the coolest ones I've seen so far, although that may be down to knowing just how much hard work went into making it [big grin].

I have more plans for hostip, but perhaps I'll leave them for another journal entry...

Simon.
