Find substrings in a map?
Hi all,
I'm trying to write a phrase counter to analyze test files. It uses a map in which the key is a string and the value is an ArrayList of integers. The problem I'm encountering is that keys that are substrings of a longer key with the same value still show up in the map, e.g.:
This is an=[1,3]
is an example of=[1,3]
is an example=[1,3]
This is an example of=[1,3]
Obviously, these are all the same reference, with the same value mapped to them, and the only one I need to keep is the last one. Is there any way to search through the LinkedHashMap that contains these mappings and delete any mapping that is a substring of another?
Any help would be great, I just can't figure this out!
Thanks,
Jezzica85
[776 byte] By [jezzica85a] at [2007-11-27 1:52:54]

Thank you, but that was only an example. That happens multiple times with different substrings and sets of strings in my map. Do you know of any way to do this with multiple substrings and sets of strings?
Thanks,
Jezzica85
If your example isn't valid, then no answer will be either. If you want further assistance, post exactly the conditions that can exist, i.e., the rules that apply.
OK, I guess I'll try again to explain what's going on and try to be more clear.
Inside the map, there is a list of phrases, ranging from 3 to 20 words in length. For those phrases that appear in the document more than once (that is, the length of their ArrayList value is 2 or more), I need to know if they are contained within any other phrase keys in the map.
So, if part of the map was like this:
This is an example=[2,4,6]
This is an=[2,4,6]
is another example of this=[1,3,5]
is another example=[1,3,5]
The revised map after the substrings were taken out would be:
This is an example=[2,4,6]
is another example of this=[1,3,5]
The second entry of the original map would be taken out because it was contained by and occurred in the same positions as the first, and the fourth entry of the original map would be taken out because it was contained by and occurred in the same positions as the third.
Is this any clearer?
Thanks for looking and trying to help,
Jezzica85
My solution still applies. For everything associated with [2,4,6], keep the longest string and delete any others. Same thing for those strings associated with [1,3,5].
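For what it's worth, here is a minimal sketch of that suggestion in Java, not a definitive implementation. It assumes the map is a Map<String, List<Integer>> and that two phrases "occur in the same positions" exactly when their lists are equal. The class name PhraseFilter and the method keepLongestPerPositions are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PhraseFilter {

    // Hypothetical helper: returns a new map in which, for each distinct
    // position list, only the longest phrase is kept.
    static Map<String, List<Integer>> keepLongestPerPositions(
            Map<String, List<Integer>> phrases) {
        // Pass 1: remember the longest key seen for each position list.
        // List.equals/hashCode compare contents, so equal lists share a bucket.
        Map<List<Integer>, String> longest = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : phrases.entrySet()) {
            String best = longest.get(e.getValue());
            if (best == null || e.getKey().length() > best.length()) {
                longest.put(e.getValue(), e.getKey());
            }
        }
        // Pass 2: copy the survivors, preserving the original insertion order.
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : phrases.entrySet()) {
            if (e.getKey().equals(longest.get(e.getValue()))) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> phrases = new LinkedHashMap<>();
        phrases.put("This is an example", Arrays.asList(2, 4, 6));
        phrases.put("This is an", Arrays.asList(2, 4, 6));
        phrases.put("is another example of this", Arrays.asList(1, 3, 5));
        phrases.put("is another example", Arrays.asList(1, 3, 5));

        // Prints {This is an example=[2, 4, 6], is another example of this=[1, 3, 5]}
        System.out.println(keepLongestPerPositions(phrases));
    }
}
```

This builds a new map rather than removing entries while iterating, which avoids a ConcurrentModificationException. One caveat: if two unrelated phrases can share the same position list, keeping only the longest would drop one of them incorrectly. In that case you would also need to check that the longer key actually contains the shorter one (for example with String.contains) before discarding it.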