Help regarding

Hi, I am new to Java and I am parsing a file. The file looks like this:

=========== file starts ================================

1: Kotloff KL.<-- authors

Bacterial diarrheal pathogens.<-- Title of the paper

Scand J Infect Dis.<-- Journal

2: Ivanoff B.

[Traveller's diarrhea: which vaccines]

J Pak Med Assoc.

3: Fournier JM.

[The current status of research on a cholera vaccine?]

J Pharm Sci.

4: Keddy KH, Koornhof HJ.

Cholera--the new epidemic?.

S Afr Med J.

=========== file ends ================================

The problem I am facing is that when I use indexOf to find where the title of the paper ends, there are different ending criteria. Is there any way I can specify that the ending criterion is either a "." (dot), a "]", or a "?"? Here is my code, which works fine if the ending criterion is a "." (dot), but when it comes to a "]" it gives this error:

java.lang.StringIndexOutOfBoundsException: String index out of range: -1

at java.lang.String.substring(String.java:1938)

at TestClass.main(TestClass.java:21)

Here is my code:

// This file uses extractor.java file to format the text file 'result.txt'

// method ext.extract returns the formatted array of strings which I write

// to 'result.txt' file with newline character(\r\n) added.

import java.io.*;

public class TestClass

{

public static void main(String[] args)

{

extractor ext = new extractor();

String[] result = ext.extract("myFile.txt");

try

{

int i;

for(i=0;result[i]!=null;i=i+3)

{

int index=0;

String Authors =result[i].substring((result[i].indexOf(':')+3),(result[i].indexOf('.')+1));

String Title = result[i+1].substring(index,(result[i+1].indexOf('.')));

String Journal = result[i+2].substring(index,(result[i+2].indexOf('.')+1));

System.out.println(Authors);

System.out.println(Title);

System.out.println(Journal);

}//end of for.

}//end of try.

catch(Exception ex)

{

ex.printStackTrace();

}

}

}

====== the output ========

Kotloff KL.

Bacterial diarrheal pathogens.

Scand J Infect Dis.

java.lang.StringIndexOutOfBoundsException: String index out of range: -1

at java.lang.String.substring(String.java:1938)

at TestClass.main(TestClass.java:21)

Press any key to continue...

[3568 byte] By [GJaina] at [2007-11-27 10:07:45]
# 1

There's no need to specify the ending index if you just want the substring to contain everything up to the end of the string. This will return the String that contains everything after the colon:

String Authors =result[i].substring((result[i].indexOf(':')+3));

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#substring(int)

If you're trying to remove the [ ] from the paper title, then check if they actually exist in the String before taking a substring.
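A minimal sketch of both points, assuming each entry looks like the sample lines in the question. The names `SubstringSketch` and `afterColon` are made up for illustration and are not from the original code, and the sketch uses `trim()` rather than the fixed `+3` offset. `indexOf` returns -1 when the character is missing, and passing that -1 to `substring` as an end index is what throws the `StringIndexOutOfBoundsException`.

```java
final class SubstringSketch {

    // Hypothetical helper: returns everything after the first ':' with
    // surrounding whitespace trimmed, or the whole line if there is no ':'.
    static String afterColon(String line) {
        int colon = line.indexOf(':');
        if (colon == -1) {              // indexOf signals "not found" with -1
            return line.trim();
        }
        // The one-argument substring runs to the end of the string.
        return line.substring(colon + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(afterColon("1: Kotloff KL."));

        String title = "[Traveller's diarrhea: which vaccines]";
        int dot = title.indexOf('.');   // no '.' in this title, so dot is -1
        if (dot != -1) {
            System.out.println(title.substring(0, dot));
        } else {
            // Fall back to the whole string instead of calling substring(0, -1).
            System.out.println(title);
        }
    }
}
```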

hunter9000a at 2007-7-13 0:44:03 > top of Java-index,Java Essentials,New To Java...
# 2

> There's no need to specify the ending index if you just want the substring to contain everything up to the end of the string. This will return the String that contains everything after the colon:
>
> String Authors =result[i].substring((result[i].indexOf(':')+3));
>
> http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#substring(int)
>
> If you're trying to remove the [ ] from the paper title, then check if they actually exist in the String before taking a substring.

Thanks for the help, but I am trying to get those values into variables, so that:

Authors will go to variable AUTHOR,

Title will go to variable TITLE,

Journal will go to variable JOURNAL.

That is why I need the ending index: to tell where the authors end and the title starts. Can you help me with this? Thanks in advance.

GJaina at 2007-7-13 0:44:03
# 3

But you already have the authors separated from the title and journal. result[i] is just the String "1: Kotloff KL." (taking the first as an example), correct? So you only need to take the substring starting after the : and ending at the end of the string. Likewise for the title. If you want to leave the [ ] or periods in the titles, then just copy the whole string over. If you want to remove the [ ], then you have to check if they exist first, since there's no standard format for the file.

Take the title in result[i+1]

if the indexOf("[") is not -1 // there's a [ in the string

    take the substring starting after that index

if the indexOf("]") is not -1 // there's a ] in the string

    take the substring up to that index

// repeat for each character you want to remove
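The steps above could be sketched in Java roughly as follows. This is only an illustration under the assumption that a title has at most one pair of brackets; `TitleCleaner` and `stripBrackets` are hypothetical names, not part of the original code.

```java
final class TitleCleaner {

    // Hypothetical helper following the steps above: drop everything up to
    // and including the first '[', then everything from the first ']' on.
    static String stripBrackets(String title) {
        String result = title;
        int open = result.indexOf('[');
        if (open != -1) {                   // there is a '[' in the string
            result = result.substring(open + 1);
        }
        int close = result.indexOf(']');    // searched in the shortened string
        if (close != -1) {                  // there is a ']' in the string
            result = result.substring(0, close);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stripBrackets("[Traveller's diarrhea: which vaccines]"));
        System.out.println(stripBrackets("Bacterial diarrheal pathogens."));
    }
}
```

Because every `substring` call is guarded by a check against -1, a title without brackets passes through unchanged instead of throwing. The same pattern can be repeated for a trailing "." or "?" if those should be removed too.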

hunter9000a at 2007-7-13 0:44:03
# 4
Thanks, man. Thanks for the help; I got your point.
GJaina at 2007-7-13 0:44:03