Trie of Integer literals as a way to map an int to an Object

I'm trying to create a bit-level trie using integers for a memory constrained application

For some reason I think that using such a Trie will be better (faster and less memory) than a HashTable with Integer objects mapping Objects.

Is this intuition right or should I just use a Hashtable?

Thanks

Mordan

[335 byte] By [Mordana] at [2007-11-26 21:45:30]
# 1
Try it and see. It's probably easier to write the trie than do the analysis.Pete
pm_kirkhama at 2007-7-10 3:34:20 > top of Java-index,Other Topics,Algorithms...
# 2

I meant for speed - try it with your loadings.

Each node in the trie has three members:public class BitTrie <T> {

final Node<T> root = new Node<T>();

static class Node <U> {

Node<U> zero;

U value;

Node<U> one;

}

...

}

Treating the key as an unsigned 32 bit value, there is at least one node per value, and at most the same number as the maximum key value. For a fully dense trie, ie there are N values and the keys go from 0..N-1, there are N nodes.

In the hash map, the entries have four members:static class Entry<K,V> implements Map.Entry<K,V> {

final K key;

V value;

Entry<K,V> next;

final int hash;

}

In addition, there is an array of entries which has size 2^ceiling(log_2(N/0.75)). The size is unaffected by the range of key values. If you have key values outside of range of the autoboxing cache, these also consume memory.

So if you have N nodes and N/density key range, assuming 2 words per object and one word per member you get these sizes:

density % 110255075100

size

5500502010 6 5 keys

2500250100503025 trie

383838383838 hashmap

10100010040201310 keys

50005002001006550 trie

767676767676 hashmap

5050005002001006650 keys

2500025001000500330250 trie

428428428428428428 hashmap

100100001000400200133100 keys

50000500020001000665500 trie

856856856856856856 hashmap

1000100000100004000200013331000 keys

50000050000200001000066655000 trie

804880488048804880488048 hashmap

10000100000010000040000200001333310000 keys

50000005000002000001000006666550000 trie

763847638476384763847638476384 hashmap

100000 100000001000000400000200000133333100000 keys

50000000500000020000001000000666665500000 trie

862144862144862144862144862144862144 hashmap

1000000 100000000 100000004000000200000013333331000000 keys

500000000 50000000 20000000 1000000066666655000000 trie

809715280971528097152809715280971528097152 hashmap

These overestimate the size of the trie for low densities, but even with a fully packed trie the size savings are not massive.

Pete

pm_kirkhama at 2007-7-10 3:34:20 > top of Java-index,Other Topics,Algorithms...
# 3

Using randomly filling the Trie to the given size, and also comparing with a Vlist, gives the following results:

density % 110255075100

size

5500502010 6 5 keys

1457575404535 trie

383838383838 hashmap

56610062402626 vlist

5050005002001006650 keys

1710940715525425405 trie

428428428428428428 hashmap

8270566304170170100 vlist

1000100000100004000200013331000 keys

3466519500137401056591707555 trie

804880488048804880488048 hashmap

131174164684168211421141084 vlist

1000000 100000000 100000004000000200000013333331000000 keys

34509140 19333755 13894290 1049758591864707621435 trie

809715280971528097152809715280971528097152 hashmap

134217890 167773604194436209727820972781048696 vlist

Which puts the trie in a better position than before, but it's worse than a VList unless the keys are very sparse.

Running performance tests, gives these results in ms. Populate was time to populate with a sequence of size random keys and value = key. Access was time to access all values in the key's range.

density % 1 2 5105075100

size

1000100000500002000010000200013331000 keys

16 5 1 1 1 1 0 trie-populate

13714 5 2 0 0 0 trie-access

9 5 1 0 1 0 1 hashmap-populate

19211 2 2 0 1 0 hashmap-access

3 2 2 0 0 0 0 vlist-populate

9 5 1 0 0 0 0 vlist-access

100001000000500000200000100000200001333310000 keys

583316 5 2 2 3 trie-populate

3701772713 2 2 1 trie-access

2529 211 4 4 5 hashmap-populate

1641092131 1 1 1 hashmap-access

77 5 3 3 3 0 1 vlist-populate

121 6 3 1 0 1 0 vlist-access

100000 10000000500000020000001000000200000133333100000 keys

6913442851781435764 trie-populate

582128301145519814834 trie-access

144155105110964566 hashmap-populate

804497184156695713 hashmap-access

244774431121127 vlist-populate

1241292211 2 1 1 vlist-access

500000 50000000 25000000 1000000050000001000000666666500000 keys

2866218812691407702620483 trie-populate

475722313089554470767484326 trie-access

1081972850817575473330 hashmap-populate

41212165974455337256253 hashmap-access

9695754203361929483 vlist-populate

11092801185811 8 6 vlist-access

Of course, it might be that my quick-and-dirty trie is a bit pants, but VList seems the way unless you have a low density, in which case HashMap.

Pete

code at http://www.tincancamera.com/examples/java/adt/

pm_kirkhama at 2007-7-10 3:34:20 > top of Java-index,Other Topics,Algorithms...