Parsing text from a website
I'm still boning up on my Java skills, and I'm wondering if you guys might have any ideas about an issue I'm having with parsing text from websites. I'm writing a program to help me with a monthly chore: it will pull down information from various local venue websites and then store date/artist information in a linked list for retrieval. I'm writing a separate class for each venue, as each site formats its information differently.
This is an example of the kind of page I'm dealing with:
http://www.catscradle.com/schedule.html
The date and artist information is stored in a table format on this page, and I'm currently parsing the page line-by-line, using the following code:
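import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

// currMonth holds the address of the schedule page as a String
URL pageURL = new URL(currMonth);
URLConnection connect = pageURL.openConnection();
HttpURLConnection hconnect = (HttpURLConnection) connect;
// bail out unless the server answers 200 OK
int code = hconnect.getResponseCode();
if (code != HttpURLConnection.HTTP_OK) {
    throw new IOException();
}
InputStream in = connect.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
boolean doneFlag = false;
while (!doneFlag) {
    // parsing algorithm goes here
}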
In the parsing algorithm I have an if statement that checks whether the current line is a 'date' line by using String.indexOf(String a). If it is, the code strips all the HTML out of the line and stores the date in my linked list.
Unfortunately, I can't quite figure out how to parse the text that comes after this 'date' line. I was wondering (a) whether I should be using a line-by-line method to parse this page at all, and (b) whether there is a better way to parse data stored in a table format. Any ideas or suggestions would be highly appreciated. Thanks!
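To make that concrete, here is a stripped-down sketch of the date check. The marker string, the class and method names, and the sample lines are all made up for illustration; the real page's markup will differ, and the regex only handles simple tags that open and close on the same line.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class DateLineSketch {
    // Hypothetical marker: whatever substring identifies a 'date' line on the real page.
    static final String DATE_MARKER = "class=\"date\"";

    // True when the line contains the marker, using String.indexOf as described above.
    static boolean isDateLine(String line) {
        return line.indexOf(DATE_MARKER) >= 0;
    }

    // Removes anything that looks like an HTML tag and trims the remaining text.
    static String stripTags(String line) {
        return line.replaceAll("<[^>]*>", "").trim();
    }

    // Walks the lines in order and stores the cleaned text of each 'date' line.
    static List<String> collectDates(List<String> lines) {
        List<String> dates = new LinkedList<>();
        for (String line : lines) {
            if (isDateLine(line)) {
                dates.add(stripTags(line));
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        // Made-up sample rows standing in for lines read from the BufferedReader.
        List<String> sample = Arrays.asList(
            "<tr>",
            "<td class=\"date\"><b>Fri 10/3</b></td>",
            "<td>Some Artist</td>",
            "</tr>");
        System.out.println(collectDates(sample)); // prints [Fri 10/3]
    }
}
```

This only picks up the date itself; the artist cell on the following line is exactly the part I don't know how to handle.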

