Parsing String

Hi all,

I have a text file that needs to be parsed. The input is like:

Air Cleaners|Purifiers

Amusement Parks|Attractions

Antiques|Collectibles

I need to split each line on the "|" character, so the output should be like:

"Air Cleaners", "Purifiers"

"Amusement Parks" , "Attractions"

"Antiques", "Collectibles"

I have the following code:

public static ArrayList parse(String filename)
{
    ArrayList parsedFields = null;
    try
    {
        BufferedReader in = new BufferedReader(new FileReader(filename));
        String str;
        while ((str = in.readLine()) != null)
        {
            String patternStr = "|";
            String[] fields = str.split(patternStr);
            for (int a = 0; a < fields.length; a++)
            {
                System.out.println("fields array = " + fields[a]);
            }
            // Trim the fields in String[] and make them lowercase
            for (int a = 0; a < fields.length; a++)
            {
                fields[a] = fields[a].trim();
                fields[a] = fields[a].toLowerCase();
            }
        } // while
        in.close();
    } // try
    catch (Exception e)
    {
        e.printStackTrace();
    }
    return parsedFields;
} // parse()

When I ran the code on my input file, patternStr did not seem to be treated as "|": every character came out as a separate field. This is part of the output:

fields array =

fields array = A

fields array = i

fields array = r

fields array =

fields array = C

fields array = l

fields array = e

fields array = a

fields array = n

fields array = e

fields array = r

fields array = s

fields array = |

fields array = P

fields array = u

fields array = r

fields array = i

fields array = f

fields array = i

fields array = e

fields array = r

fields array = s

Does anybody have any idea why this is happening? Any help is greatly appreciated.

[2923 byte] By [RonitTa] at [2007-11-27 7:56:58]
# 1
split() accepts a regular expression string, and | is a special regex character (alternation). Escape it like this:
String patternStr = "\\|";
gimbal2a at 2007-7-12 19:38:46 > top of Java-index,Java Essentials,Java Programming...
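As a regex, an unescaped "|" is an alternation between two empty expressions, so it matches the empty string at every position, which is why the line gets cut between characters. The following is a minimal sketch of the difference, not code from the thread: the class name SplitDemo and its method names are made up for illustration, and Pattern.quote is shown as an alternative to escaping by hand.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {
    // Splits on a literal '|' using a manually escaped regex.
    public static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    // Same idea, but Pattern.quote builds the literal pattern for us.
    public static String[] splitQuoted(String line) {
        return line.split(Pattern.quote("|"));
    }

    // Unescaped: "|" matches the empty string, so the cut happens between characters.
    public static String[] splitUnescaped(String line) {
        return line.split("|");
    }

    public static void main(String[] args) {
        String line = "Air Cleaners|Purifiers";
        System.out.println(Arrays.toString(splitEscaped(line)));
        System.out.println(Arrays.toString(splitQuoted(line)));
        System.out.println(splitUnescaped(line).length + " pieces without escaping");
    }
}
```

Pattern.quote is handy when the delimiter comes from a variable and may contain other regex metacharacters.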
# 2

Thanks a lot, it worked!

Can you help me with one more thing: how can I split the String on either "|" or "&"? Some lines in the input file look like this:

Air Cleaners|Purifiers --> "Air Cleaners", "Purifiers"

Amusement Parks & Attractions --> "Amusement Parks", "Attractions"

RonitTa at 2007-7-12 19:38:46
# 3
You need to escape the delimiter:
String patternStr = "\\|";
String[] fields = str.split(patternStr);
That should work.
Regards,
Nikunj Manocha
manochanikunja at 2007-7-12 19:38:46
# 4
String patternStr = "[\\|\\&]";
manochanikunja at 2007-7-12 19:38:46
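Inside a character class neither | nor & has a special meaning, so the backslashes in "[\\|\\&]" are harmless but optional. Here is a small sketch using the two sample lines from the question; the class and method names are hypothetical. With this pattern the fields from the "&" line still carry their surrounding spaces, so the trim() step is still needed.

```java
import java.util.Arrays;

public class EitherDelimiter {
    // Splits on '|' or '&' and trims each resulting field.
    public static String[] splitTrimmed(String line) {
        String[] fields = line.split("[|&]");
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitTrimmed("Air Cleaners|Purifiers")));
        System.out.println(Arrays.toString(splitTrimmed("Amusement Parks & Attractions")));
    }
}
```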
# 5

You probably won't need to trim() the strings if you add the whitespace to the split() regex. Also, to avoid recompiling the regex every time you use it, you can create a Pattern object before starting the loop and use that:

public static ArrayList parse(String filename)
    throws IOException
{
    Pattern splitPattern = Pattern.compile("\\s*[|&]\\s*");
    ArrayList parsedFields = null;
    BufferedReader in = new BufferedReader(new FileReader(filename));
    String str = null;
    while ((str = in.readLine()) != null)
    {
        String[] fields = splitPattern.split(str.toLowerCase());
        // etc.
    }
    // etc.
}

uncle_alicea at 2007-7-12 19:38:46
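For reference, one way the pieces of this thread might fit together is sketched below. It is only an illustration under a few assumptions that are not in the original posts: the class name CategoryParser and the splitLine helper are made up, the result is returned as a List<String[]>, blank lines are skipped, and the input comes from any Reader so the sample can run without a file. The precompiled pattern is the one from reply # 5.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class CategoryParser {
    // Compiled once: '|' or '&' with optional whitespace on either side.
    private static final Pattern SPLIT_PATTERN = Pattern.compile("\\s*[|&]\\s*");

    // Lowercases one line and splits it into fields.
    // The line is trimmed first because the pattern only removes
    // whitespace next to a delimiter, not at the ends of the line.
    public static String[] splitLine(String line) {
        return SPLIT_PATTERN.split(line.trim().toLowerCase());
    }

    // Reads all lines from the source and returns one String[] per non-blank line.
    public static List<String[]> parse(Reader source) throws IOException {
        List<String[]> parsedFields = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines carry no data
                }
                parsedFields.add(splitLine(line));
            }
        }
        return parsedFields;
    }

    public static void main(String[] args) throws IOException {
        String sample = "Air Cleaners|Purifiers\n\nAmusement Parks & Attractions\n";
        for (String[] fields : parse(new StringReader(sample))) {
            System.out.println(Arrays.toString(fields));
        }
    }
}
```

To read from a file instead, pass new FileReader(filename) to parse(); the try-with-resources block closes the reader even if readLine() throws.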