BitSet vs boolean array
I wrote a class that could calculate prime numbers using the Sieve of Eratosthenes by 2 different methods. One method used a BitSet for the sieve, and the other used an array of boolean primitives. Both return an int array containing all the prime numbers between 1 and an int method parameter.
Ive just finished testing how fast each is with varying limits on the size of the sieze. It turns out the BitSet method is generally a bit less than twice as slow as the boolean array method. When searching for all primes between 1 and 10 million, the BitSet method took an average of 2 seconds to run, while the method using a boolean array took an average of 1.1 seconds.
A search for all primes between 1 and 100 million sees the boolean array take an average of 12.7 seconds and the BitSet 21.7 seconds. However I noticed that the VM Size listed in the processes tab in task manager showed java.exe using 230mb for the boolean array to run on this value range, while the BitSet method required only 75mb.
I then attempted to find all primes between 1 and 1 billion. Upon execution, the boolean array method dies (despite java being started with -Xmx1024m) with:
Exception in thread"main" java.lang.OutOfMemoryError: Java heap space
The BitSet method managed to pulled through though in an average of 225 seconds and a VM Size maximum of 422mb. The BitSet method could even handle a search between 1 and Integer.MAX_VALUE, taking an average of 550 seconds and a VM Size peak of 900mb.
Im curious now as to why Java can't create an array of length 1 billion even when up to 1 gig of ram is available, yet a BitSet can grow to Integer.MAX_VALUE? It seems to be quite a bit quicker to use an array, but its just not feasible at the upper end. Also, why does the array require such a significantly greater amount of memory than an equally sized BitSet? Can anyone shed some light on this for me?
Cheers,
-j

