Parsing an HTML file
Hello, I'm connecting to a website and reading in its HTML, and I need a way of recognising tags such as <link> and <item>.
I wrote something before which pulls out <a href> links. How can I adapt this bit of code to get tags such as <link> or <item>?
URL url = new URL(s1);
URLConnection conn = url.openConnection();
Reader read = new InputStreamReader(conn.getInputStream());
HTMLEditorKit kit = new HTMLEditorKit();
HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
kit.read(read, doc, 0);
// Walk every <a> element in the parsed document
HTMLDocument.Iterator it = doc.getIterator(HTML.Tag.A);
while (it.isValid()) {
    SimpleAttributeSet s = (SimpleAttributeSet) it.getAttributes();
    // getAttribute needs a key; HREF holds the link target
    String link = (String) s.getAttribute(HTML.Attribute.HREF);
    if (link != null) {
        System.out.println(link);
    }
    it.next();
}
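One possible way to adapt this, offered as a rough sketch rather than a definitive answer: instead of asking the document for an iterator over one known tag, feed the reader to Swing's ParserDelegator and inspect every tag through an HTMLEditorKit.ParserCallback. This matters for <item>, which is not a standard HTML tag, so the sketch assumes the parser reports it as an unknown tag through handleSimpleTag (with the closing tag flagged by the "endtag" attribute). The names TagFinder and findTags are made up for illustration, and the sample markup is invented.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of the original code.
class TagFinder {

    // Collects every opening tag whose lower-case name is in `wanted`,
    // in document order, as "name" or "name -> href" when an href is present.
    static List<String> findTags(Reader in, final Set<String> wanted) throws IOException {
        final List<String> found = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                record(t, a);
            }

            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                // Empty tags such as <link>, and tags the parser does not know
                // (such as <item>), arrive here.
                record(t, a);
            }

            private void record(HTML.Tag t, MutableAttributeSet a) {
                // Closing tags of unknown elements carry the "endtag" attribute; skip them.
                if (a.isDefined(HTML.Attribute.ENDTAG)) {
                    return;
                }
                String name = t.toString().toLowerCase();
                if (wanted.contains(name)) {
                    Object href = a.getAttribute(HTML.Attribute.HREF);
                    found.add(href == null ? name : name + " -> " + href);
                }
            }
        };
        // true = ignore any charset declared inside the document
        new ParserDelegator().parse(in, callback, true);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Invented sample markup; in the original code the Reader would wrap
        // conn.getInputStream() instead.
        String html = "<html><head><link rel=\"stylesheet\" href=\"style.css\"></head>"
                + "<body><item>one</item><a href=\"x.html\">x</a></body></html>";
        Set<String> wanted = new HashSet<>(Arrays.asList("link", "item"));
        for (String tag : findTags(new StringReader(html), wanted)) {
            System.out.println(tag);
        }
    }
}
```

With the original code, the same Reader built from conn.getInputStream() could be passed to findTags in place of the StringReader. If the page is really an RSS or other XML feed (which is where <item> usually appears), an XML parser may be a better fit than the HTML one.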

