Search Algorithm in Java

Hey Guys,

I have written a crude search algorithm that pulls 4 fields from a database table into Java String objects. Each String is then tokenized, and I search the tokens for the text the user entered.
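
To make the approach concrete, here is a minimal sketch of the in-memory version. It is not the actual code: the class and method names are made up for illustration, and it assumes whitespace tokenizing with a case-insensitive match on whole tokens, which may differ from the real matching rule.

```java
import java.util.StringTokenizer;

class InMemorySearch {
    // Returns true if any whitespace-separated token equals the search term,
    // ignoring case. The whole text must already be in memory as one String.
    // (Whole-token matching is an assumption made for this sketch.)
    static boolean containsToken(String text, String term) {
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            if (tokens.nextToken().equalsIgnoreCase(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for the text of one field pulled from the database.
        String fieldText = "text extracted from a PDF file";
        System.out.println(containsToken(fieldText, "pdf")); // prints "true"
    }
}
```

The memory cost comes from `fieldText`: the complete field has to fit on the heap before the first token is even looked at.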

I once hit a problem where one of the Strings pulled from the DB was so huge that I got an OutOfMemoryError. Simply put, the text was too big for my system to hold in one String object.

I resolved this by giving the JVM more heap:

java -Xmx512M MySearchClass

Now I am afraid that if there is another HUGE text in the DB, I can still run out of memory.

Do I just need to add MORE RAM to my system and allot a really huge heap, such as 4 GB or so, so that the JVM can use it if need be?

Suppose I get a server machine with 4 GB RAM. At some stage I'd still have an upper limit on the amount of text I can extract from the DB, right?

Is there any other way to stream the text from the DB and then search? Remember that I am tokenizing the text in the algorithm, so I think I can't use a buffer, or can I?
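
To illustrate what a streaming version might look like, here is a rough sketch, not a tested solution. It assumes the text can be obtained as a `java.io.Reader` (JDBC offers `ResultSet.getCharacterStream` for this, though whether the data is truly streamed depends on the driver) and that tokens are whitespace-separated. The class name, method name, and column name are made up for illustration; a `StringReader` stands in for the database so the example runs on its own.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

class StreamingSearch {
    // Pulls whitespace-separated tokens from the reader one at a time, so the
    // full text never has to be held in a single String. Only the Scanner's
    // internal buffer and the current token are in memory.
    static boolean containsToken(Reader source, String term) {
        Scanner tokens = new Scanner(source);
        while (tokens.hasNext()) {
            if (tokens.next().equalsIgnoreCase(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for a Reader from the database, e.g.
        // resultSet.getCharacterStream("some_column") (hypothetical column name).
        Reader source = new StringReader("text extracted from a PDF file");
        System.out.println(containsToken(source, "pdf")); // prints "true"
    }
}
```

Note that `Scanner` swallows read errors instead of throwing them; `ioException()` returns the last one, so real code would want to check it after the loop.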

Thanks a ton!

-AZ

[1181 byte] By [azaidi1a] at [2007-11-27 10:34:22]
# 1

And you aren't doing this as a SQL query because.... ?

cotton.ma at 2007-7-28 18:28:24
# 2

These are SQL queries. I use SQL queries to pull the text out of the DB and then store it in a String object. The problem is that the text in the DB can get so huge that the String object cannot hold it anymore due to memory limits. The JVM is assigned a heap size, and the String object can grow big enough to exceed it.

I used: java -Xmx512M MySearchClass

That gave the JVM a 512 MB heap, and I haven't encountered the OutOfMemoryError since. But my guess is that for some other data set, the String could be so big that even 512 MB might not be enough.

This search algorithm will be deployed on a server-class machine with about 4 GB RAM, so we can give the JVM a 4 GB heap to use. But it's a limitation again, right?

The texts in the DB are extracted from PDF files.

azaidi1a at 2007-7-28 18:28:24