This is a real-life benchmark based on a bootstrap method I once had to write for a statistical evaluation. The program does a lot of shuffling in an integer array and a lot of double-precision computation. In short, it tests Java arrays (that is, the load and store timings of array elements) and the computational side of Java. This is the kind of thing that interests me most.
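To give an idea of the kind of work involved, here is a minimal sketch of a bootstrap computation. It is not the benchmark source: the class name `BootstrapSketch`, the method `bootstrapStdError`, and the choice of the mean as the statistic are assumptions made for illustration. It only shows the typical mix of integer index shuffling and double arithmetic.

```java
import java.util.Random;

class BootstrapSketch {
    /**
     * Estimates the standard error of the mean by resampling with replacement.
     * Hypothetical helper; requires resamples >= 2 and a non-empty data array.
     */
    static double bootstrapStdError(double[] data, int resamples, long seed) {
        Random rng = new Random(seed);
        int n = data.length;
        int[] index = new int[n];            // integer array that is refilled every round
        double[] means = new double[resamples];

        for (int r = 0; r < resamples; r++) {
            // Draw n indices with replacement (array stores).
            for (int i = 0; i < n; i++) {
                index[i] = rng.nextInt(n);
            }
            // Compute the statistic on the resample (array loads, double adds).
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += data[index[i]];
            }
            means[r] = sum / n;
        }

        // Sample standard deviation of the resampled means.
        double grand = 0.0;
        for (int r = 0; r < resamples; r++) {
            grand += means[r];
        }
        grand /= resamples;
        double squares = 0.0;
        for (int r = 0; r < resamples; r++) {
            double d = means[r] - grand;
            squares += d * d;
        }
        return Math.sqrt(squares / (resamples - 1));
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(bootstrapStdError(data, 2000, 42L));
    }
}
```

Almost all the time in such a loop goes into array accesses and floating point additions, which is what the benchmark is meant to measure.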
I have prepared a second test implementing the sieve algorithm in Java and C (actually C++ to profit from inlining). The results are below.
I have written it in C and in Java. You can download the sources (Java, C) to try it yourself and verify that the comparison is fair. Concerning C, an optimization using pointer arithmetic might improve the outcome somewhat on a Pentium system, but not on a RISC system. If you think the results on your system would be interesting here, please mail them to me.
The purpose of this benchmark is not to judge which JVM is better. My goal was to learn how much speed one loses with Java. The outcome is that it depends on the Java implementation. I do not consider a loss of 50% large. On the other hand, a factor of 10 makes a JVM unusable. Even a factor of 5 may be unbearable, except in some interactive applications.
Mr Osvaldo Pinali Doederlein pointed out that Sun's new HotSpot technology will not perform well with this code. The reason is that the code uses a single method containing a loop. This new generation of compilers uses the compiled version of a method only from the second or a later call, so a method that is called once and does all its work in one loop never benefits. So I prepared a new version that uses some subroutine calls.
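The difference between the two program shapes can be sketched as follows. This is an illustration only, not the benchmark code: the class `HotLoopSketch` and its methods are hypothetical, and the arithmetic is a placeholder. Both variants compute the same value; they differ only in whether the inner pass is a separate method that is entered many times.

```java
class HotLoopSketch {
    // Variant A: all work happens inside one method that is called once.
    static long sumMonolithic(int[] a, int rounds) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < a.length; i++) {
                total += a[i] ^ r;
            }
        }
        return total;
    }

    // Variant B: the inner pass is a subroutine, entered once per round,
    // so a compiler that switches to compiled code on a later call can use it.
    static long onePass(int[] a, int r) {
        long subtotal = 0;
        for (int i = 0; i < a.length; i++) {
            subtotal += a[i] ^ r;
        }
        return subtotal;
    }

    static long sumWithCalls(int[] a, int rounds) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            total += onePass(a, r);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] a = new int[1000];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
        }
        System.out.println(sumMonolithic(a, 1000) == sumWithCalls(a, 1000));
    }
}
```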
The new version can use two threads to distribute the load across multiple processors, if requested. The command line version takes the argument threads to enable multi-threading. The applet has a checkbox for this purpose.
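A minimal sketch of such a two-thread split might look like this. It assumes the work can be divided into two independent halves. The class `TwoThreadSketch`, the `Worker` class, and the square-root workload are hypothetical, and the real benchmark may divide its work differently.

```java
class TwoThreadSketch {
    /** Processes one slice of the data and keeps its partial result. */
    static final class Worker implements Runnable {
        private final double[] data;
        private final int from;
        private final int to;
        double partial;

        Worker(double[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        public void run() {
            double s = 0.0;
            for (int i = from; i < to; i++) {
                s += Math.sqrt(data[i]);
            }
            partial = s;
        }
    }

    static double compute(double[] data, boolean threads) {
        int mid = data.length / 2;
        Worker first = new Worker(data, 0, mid);
        Worker second = new Worker(data, mid, data.length);
        if (threads) {
            Thread t1 = new Thread(first);
            Thread t2 = new Thread(second);
            t1.start();
            t2.start();
            try {
                // join() waits for both halves and makes their results visible here.
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        } else {
            // Single-threaded mode: run both halves in the calling thread.
            first.run();
            second.run();
        }
        return first.partial + second.partial;
    }

    public static void main(String[] args) {
        boolean threads = args.length > 0 && args[0].equals("threads");
        double[] data = new double[1000000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(compute(data, threads));
    }
}
```

Because each half is summed in the same order in both modes, the result is identical with and without threads; only the elapsed time should differ.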
If threads are disabled, the computation takes place in the event loop. This is not the correct way to handle longer computations, but there is a danger that background threads run at a lower priority. Indeed, in Windows Explorer I found that two threads are far slower than a single thread. The only explanation I have is that the background jobs are interrupted by other tasks.
I want to stress that this benchmark does not test the object-oriented (OO) side of Java. Abuse of object creation can slow down any program. If you create several thousand objects on the heap, you should not be surprised by a loss of speed. This problem is common to Java and, say, C++. However, it speaks against OO only in really extreme cases. The other problem is the inherent cost of calling virtual functions all the time. This program does not test that very well. My guess is that JIT compilers and inlining will help Java get on par with C++ in this respect.
Neither does it test the GUI part of Java. I find such tests difficult to perform, since so much depends on programming skill. In particular, an overloaded and badly designed GUI may easily become too slow to be useful. Not using double buffering is another source of complaints.
I was surprised how well Sun's JIT compiler performs on my Windows system. This agrees well with another program I have for both Java and C. It is also surprising how slow Java 1.0.2 was. It also shows that "your mileage may vary".
The times for the Unix versions are user times on an otherwise idle system, not counting the loading time of the VM. The DEC implementation of Java seems to be unusable. I have now removed some timings that were as expected, because they can be estimated by comparison with a similar system at a different clock rate.

| System | Runtime | Time |
|---|---|---|
| Pentium 166, Win95 | Borland C++ 5.01 | 84 |
| | JDK 1.1.7 | 100 |
| | IE 4.0 | 130 |
| | Netscape 4.5 | 209 |
| | Java 1.1.7 (-nojit) | 490 |
| | Java 1.0.2 | 1164 |
| DEC Alpha 255 Unix | cc -O | 120 |
| | java 1.1.4 (-jit) | 3200 |
| | Netscape 4.05 | 1464 |
| Mac G3 266 | MRJ 2.0 | 110 |
| PowerMac 7600/132 | Code Worrier C | 38 |
| | MRJ 2.1ea3 | 83 |
| | Netscape 4.5 | 472 |
| Linux, Pentium 166 | cc -O | 65 |
| | IBM JDK 1.1.8 | 95 |
| | JDK 1.1.7 | 480 |
| | kaffe 1.00 | 372 |
| | tya 1.1v4 | 235 |
| | Netscape 4.5 | 1153 |
| PII 233, NT | gcc -O | 31 |
| | JDK 1.1.7 | 47 |
| | IE 4.0 | 57 |
| PII 233, OS/2 | Java 1.1.7A IBM | 41 |
| | java -nojit | 348 |
| | Netscape 4.04 | 41 |
| Sun Ultra 60, 360 MHz | cc -x04 | 30 |
| | JDK 1.2, Prod.Rel. for Solaris | 45 |

You can try the benchmark yourself with your browser by pressing Compute in the applet below.
To gain more confidence in Java, I implemented a second algorithm in Java and C. Actually, I used C++ to profit from inlined functions and the class syntax. The port is rather direct. Just look at the Java code and the C code yourself. The following results are consistent with those of the previous test.
The only problem is with Sun's BitSet class. It is really slow. The reason is synchronization.

| System | Runtime | Time |
|---|---|---|
| K6-2/300, 64 MB, Windows 95 | Java 1.1.7B using own BitSet | 24 sec |
| | Java 1.1.7B using Sun's BitSet | (!!!) 244 sec |
| | Borland C++ 5.01, fully optimized | 18 sec |
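
For readers wondering what an "own BitSet" could look like, here is a minimal sketch. It is not the class used for the timings above: `SimpleBitSet` and `countPrimes` are hypothetical names, and the sieve is assumed to be the Sieve of Eratosthenes. The point is only that `set` and `get` are plain, unsynchronized methods on an int array, so each call costs a shift, a mask, and one array access.

```java
class SieveSketch {
    /** Minimal unsynchronized bit set backed by an int array (hypothetical). */
    static final class SimpleBitSet {
        private final int[] words;

        SimpleBitSet(int size) {
            words = new int[(size + 31) >>> 5];   // 32 bits per int
        }

        void set(int i) {
            words[i >>> 5] |= 1 << (i & 31);
        }

        boolean get(int i) {
            return (words[i >>> 5] & (1 << (i & 31))) != 0;
        }
    }

    /** Counts the primes up to and including limit (limit < Integer.MAX_VALUE). */
    static int countPrimes(int limit) {
        if (limit < 2) {
            return 0;
        }
        SimpleBitSet composite = new SimpleBitSet(limit + 1);
        int count = 0;
        for (int p = 2; p <= limit; p++) {
            if (composite.get(p)) {
                continue;
            }
            count++;
            // Mark multiples of p, starting at p*p; long avoids int overflow.
            for (long m = (long) p * p; m <= limit; m += p) {
                composite.set((int) m);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(100)); // prints 25
    }
}
```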