Search String Array

I need to figure out the way to very quickly search the String[].

I may have up to 5000 string entries in the array (arraylist).

I know that I can do it with looping but I was wondering if there is a way to use Binary search to return multiple entries faster.

I need to match strings with contains logic , not an exact match.

Thanks

Mikhail

[378 byte] By [mgoncharova] at [2007-11-27 4:36:23]
# 1

> I need to figure out the way to very quickly search

> the String[].

>

> I may have up to 5000 string entries in the array

> (arraylist).

5000 is peanuts.

> I know that I can do it with looping but I was

> wondering if there is a way to use Binary search to

> return multiple entries faster.

A binary search is only possible on a sorted array.

> I need to match strings with contains logic , not an

> exact match.

Then looping through your array is your only option.

prometheuzza at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...
# 2

Is it really necessary to use an array? Otherwise a LinkedHashMap would be an option. It combines the sequence of an array (with an iterator) with the performance of a HashMap. For your "fuzzy" search you need to adapt the equals()/hashCode() of your hashed key, which -- in your case -- would be some special string object.

Sincerely

Christian Ullenboom | http://www.tutego.com/

Christian.Ullenbooma at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...
# 3

How would you realize a contains-based search using a single HashMap? That sounds non-sensical to me.

If you think to have to optimize the search in an Array of 5000 entries, there is something else wrong in your application.

Having a really big amount of entries, though, one would build indexes (e.g., on character sequences to bucket matching entries).

stefan.schulza at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...
# 4

Please take a look at the implementation of a HashSet. It is based on a HashMap where the value is a dummy object. This answers the first question how contains() is implemented ?it is quite fast. Sure, we do not need a HashMap, but there is no LinkedHashSet today to give sequential access. And yes I would use an array too if the problem isn't a very time consuming one.

Best wishes,

Christian Ullenboom | http://www.tutego.com/

Christian.Ullenbooma at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...
# 5

You are mixing two "contains". I did not ask for the contains-method of a Map.

The OP is talking about Strings matching based on a "contains logic", i.e., all Strings that contain the given Sequence of characters would match (or maybe he's going for regular expressions in the end). This is not (feasibly) solvable using a Map. You would have to put any possible pattern into the Map, which, for 5000 words, is quite a bunch (at least 26^(average number of letters) or so, my math is quite rusted).

stefan.schulza at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...
# 6

Sorry for my unclarity. For "arbitrary" strings I cannot see a usage of a Map either. I've had a special solution in mind where you don't need an exact match but can find a canonical form of the string. With the declaration "I need to match strings with contains logic, not an exact match" I thought of a search with key objects in the map similar the following for managing strings independent of lower/upper case letters.

public class EqualsIgnoreCaseString

{

private final String string;

public EqualsIgnoreCaseString( String string )

{

this.string = string.toLowerCase();

}

@Override public int hashCode()

{

return string.hashCode();

}

@Override public boolean equals( Object obj )

{

if ( this == obj )

return true;

if ( obj == null )

return false;

if ( getClass() != obj.getClass() )

return false;

if ( string == null )

if ( ((EqualsIgnoreCaseString) obj).string != null )

return false;

return string.equals( ((EqualsIgnoreCaseString) obj).string );

}

}

The same is possible for soundex or a lot of other criteria.

Best wishes,

Christian Ullenboom | http://www.tutego.com/

Christian.Ullenbooma at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...
# 7
Sure, if there is one sole static criteria, this could work.(Btw., according to your constructor, string cannot be null or the constructor will fail with NPE. Hence, you do not need the test on null in equals.)
stefan.schulza at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...
# 8
--null
3birena at 2007-7-12 9:46:33 > top of Java-index,Core,Core APIs...