BitSet vs boolean array

I wrote a class that could calculate prime numbers using the Sieve of Eratosthenes by 2 different methods. One method used a BitSet for the sieve, and the other used an array of boolean primitives. Both return an int array containing all the prime numbers between 1 and an int method parameter.

Ive just finished testing how fast each is with varying limits on the size of the sieze. It turns out the BitSet method is generally a bit less than twice as slow as the boolean array method. When searching for all primes between 1 and 10 million, the BitSet method took an average of 2 seconds to run, while the method using a boolean array took an average of 1.1 seconds.

A search for all primes between 1 and 100 million sees the boolean array take an average of 12.7 seconds and the BitSet 21.7 seconds. However I noticed that the VM Size listed in the processes tab in task manager showed java.exe using 230mb for the boolean array to run on this value range, while the BitSet method required only 75mb.

I then attempted to find all primes between 1 and 1 billion. Upon execution, the boolean array method dies (despite java being started with -Xmx1024m) with:

Exception in thread"main" java.lang.OutOfMemoryError: Java heap space

The BitSet method managed to pulled through though in an average of 225 seconds and a VM Size maximum of 422mb. The BitSet method could even handle a search between 1 and Integer.MAX_VALUE, taking an average of 550 seconds and a VM Size peak of 900mb.

Im curious now as to why Java can't create an array of length 1 billion even when up to 1 gig of ram is available, yet a BitSet can grow to Integer.MAX_VALUE? It seems to be quite a bit quicker to use an array, but its just not feasible at the upper end. Also, why does the array require such a significantly greater amount of memory than an equally sized BitSet? Can anyone shed some light on this for me?

Cheers,

-j

[1982 byte] By [jadedindustriesa] at [2007-10-2 3:06:06]
# 1

I think the reason for your performance results are as follows:

Java VM spec says:

The Java virtual machine does directly support boolean arrays. Its newarray instruction enables creation of boolean arrays. Arrays of type boolean are accessed and modified using the byte array instructions baload and bastore.2

In Sun's JDK releases 1.0 and 1.1, and the Java 2 SDK, Standard Edition, v1.2, boolean arrays in the Java programming language are encoded as Java virtual machine byte arrays, using 8 bits per boolean element.

So from what I understand out of this. At best you are using 1byte per boolean in a boolean array where as a Bitset is using only a bit.

So if you do the math a gig of memory is 1,024,000,000 bytes.

If you are trying to get 1,000,000,000 booleans into that space you can see where you will have problems. Given some of that space you initialized at startup is being taken by the VM and not dedicated soley to storage of your booleans. The bitset on the other hand is 1/8th the size.

This is the best I can determine.Can anyone confirm this?

jfortneya at 2007-7-15 21:32:41 > top of Java-index,Administration Tools,Sun Connection...
# 2

From the specification:

In Sun's JDK releases 1.0 and 1.1, and the Java 2 SDK, Standard Edition, v1.2, boolean arrays in the Java programming language are encoded as Java virtual machine byte arrays, using 8 bits per boolean element.

A BitSet uses long [ ] (64 bits - 6 address bits for each unit) to store bits.

Learnablea at 2007-7-15 21:32:41 > top of Java-index,Administration Tools,Sun Connection...