Trie of Integer literals as a way to map an int to an Object
I'm trying to create a bit-level trie using integers for a memory constrained application
For some reason I think that using such a Trie will be better (faster and less memory) than a HashTable with Integer objects mapping Objects.
Is this intuition right or should I just use a Hashtable?
Thanks
Mordan
[335 byte] By [
Mordana] at [2007-11-26 21:45:30]

# 2
I meant for speed - try it with your loadings.
Each node in the trie has three members:public class BitTrie <T> {
final Node<T> root = new Node<T>();
static class Node <U> {
Node<U> zero;
U value;
Node<U> one;
}
...
}
Treating the key as an unsigned 32 bit value, there is at least one node per value, and at most the same number as the maximum key value. For a fully dense trie, ie there are N values and the keys go from 0..N-1, there are N nodes.
In the hash map, the entries have four members:static class Entry<K,V> implements Map.Entry<K,V> {
final K key;
V value;
Entry<K,V> next;
final int hash;
}
In addition, there is an array of entries which has size 2^ceiling(log_2(N/0.75)). The size is unaffected by the range of key values. If you have key values outside of range of the autoboxing cache, these also consume memory.
So if you have N nodes and N/density key range, assuming 2 words per object and one word per member you get these sizes:
density % 110255075100
size
5500502010 6 5 keys
2500250100503025 trie
383838383838 hashmap
10100010040201310 keys
50005002001006550 trie
767676767676 hashmap
5050005002001006650 keys
2500025001000500330250 trie
428428428428428428 hashmap
100100001000400200133100 keys
50000500020001000665500 trie
856856856856856856 hashmap
1000100000100004000200013331000 keys
50000050000200001000066655000 trie
804880488048804880488048 hashmap
10000100000010000040000200001333310000 keys
50000005000002000001000006666550000 trie
763847638476384763847638476384 hashmap
100000 100000001000000400000200000133333100000 keys
50000000500000020000001000000666665500000 trie
862144862144862144862144862144862144 hashmap
1000000 100000000 100000004000000200000013333331000000 keys
500000000 50000000 20000000 1000000066666655000000 trie
809715280971528097152809715280971528097152 hashmap
These overestimate the size of the trie for low densities, but even with a fully packed trie the size savings are not massive.
Pete
# 3
Using randomly filling the Trie to the given size, and also comparing with a Vlist, gives the following results:
density % 110255075100
size
5500502010 6 5 keys
1457575404535 trie
383838383838 hashmap
56610062402626 vlist
5050005002001006650 keys
1710940715525425405 trie
428428428428428428 hashmap
8270566304170170100 vlist
1000100000100004000200013331000 keys
3466519500137401056591707555 trie
804880488048804880488048 hashmap
131174164684168211421141084 vlist
1000000 100000000 100000004000000200000013333331000000 keys
34509140 19333755 13894290 1049758591864707621435 trie
809715280971528097152809715280971528097152 hashmap
134217890 167773604194436209727820972781048696 vlist
Which puts the trie in a better position than before, but it's worse than a VList unless the keys are very sparse.
Running performance tests, gives these results in ms. Populate was time to populate with a sequence of size random keys and value = key. Access was time to access all values in the key's range.
density % 1 2 5105075100
size
1000100000500002000010000200013331000 keys
16 5 1 1 1 1 0 trie-populate
13714 5 2 0 0 0 trie-access
9 5 1 0 1 0 1 hashmap-populate
19211 2 2 0 1 0 hashmap-access
3 2 2 0 0 0 0 vlist-populate
9 5 1 0 0 0 0 vlist-access
100001000000500000200000100000200001333310000 keys
583316 5 2 2 3 trie-populate
3701772713 2 2 1 trie-access
2529 211 4 4 5 hashmap-populate
1641092131 1 1 1 hashmap-access
77 5 3 3 3 0 1 vlist-populate
121 6 3 1 0 1 0 vlist-access
100000 10000000500000020000001000000200000133333100000 keys
6913442851781435764 trie-populate
582128301145519814834 trie-access
144155105110964566 hashmap-populate
804497184156695713 hashmap-access
244774431121127 vlist-populate
1241292211 2 1 1 vlist-access
500000 50000000 25000000 1000000050000001000000666666500000 keys
2866218812691407702620483 trie-populate
475722313089554470767484326 trie-access
1081972850817575473330 hashmap-populate
41212165974455337256253 hashmap-access
9695754203361929483 vlist-populate
11092801185811 8 6 vlist-access
Of course, it might be that my quick-and-dirty trie is a bit pants, but VList seems the way unless you have a low density, in which case HashMap.
Pete
code at http://www.tincancamera.com/examples/java/adt/