JTidy

Does anyone know where I can find an example of using JTidy to parse an HTML file from the web?

I am hoping to use JTidy and SAX for some web mining.

Any pointers will be much appreciated.

[203 byte] By [Antananarivoa] at [2007-11-27 11:12:49]
# 1

> Does anyone know where I can find an example of using JTidy to parse an HTML file from the web?

Yes; the JTidy website has examples, assuming you know how to get a web page as an input stream (which is unrelated to JTidy).
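
For the "web page as an input stream" part, here is a minimal sketch using only the standard library. The class name `PageStream`, the helper names `open` and `readAll`, and the `https://example.com/` default are made up for illustration; `readAll` merely stands in for whatever eventually consumes the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class PageStream {
    // Opens the resource named by an absolute URL string.
    // The same call handles http:, https:, and file: URLs.
    static InputStream open(String location) throws IOException {
        return URI.create(location).toURL().openStream();
    }

    // Reads the stream to the end and decodes it as UTF-8 (an assumption;
    // real pages may declare a different encoding).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default location; pass a real URL as the first argument.
        String location = args.length > 0 ? args[0] : "https://example.com/";
        try (InputStream in = open(location)) {
            System.out.println(readAll(in).length() + " characters read");
        }
    }
}
```

Because `open` takes a plain URL string, a `file:` URL works too, which is handy for trying things out on a saved page before going to the network.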

~

yawmarka at 2007-7-29 13:57:03 > top of Java-index,Java Essentials,Java Programming...
# 2

> Yes; the JTidy website has examples, assuming you know how to get a web page as an input stream (which is unrelated to JTidy).

Thanks. I hope to start by being able to parse an HTML file saved on disc and then proceed from there.

Antananarivoa at 2007-7-29 13:57:03
# 3

> Thanks. I hope to start by being able to parse an HTML file saved on disc and then proceed from there.

Fortunately, JTidy doesn't care where the HTML comes from; just get the file contents as an input stream, and you're all set.

Best of luck!
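
To illustrate the point that the parser only ever sees a stream, here is a small sketch using only the standard library. `HtmlSource`, `consume`, `fromFile`, and `fromString` are hypothetical names; `consume` is a stand-in for the parser and simply counts bytes. With JTidy on the classpath, that is roughly where the stream would be handed over (I believe `Tidy.parseDOM(InputStream, OutputStream)` is the relevant call, but check the JTidy documentation).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class HtmlSource {
    // Hypothetical stand-in for the parser: it receives an InputStream and
    // has no idea whether the bytes come from disk, memory, or the network.
    // With JTidy available, the stream would be passed to JTidy here instead.
    static int consume(InputStream in) throws IOException {
        int total = 0;
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
        }
        return total;
    }

    // An HTML file saved on disk, exposed as a stream.
    static InputStream fromFile(Path file) throws IOException {
        return Files.newInputStream(file);
    }

    // The same HTML held in memory, exposed as a stream.
    static InputStream fromString(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><p>Hello</p></body></html>";
        Path file = Files.createTempFile("sample", ".html");
        Files.write(file, html.getBytes(StandardCharsets.UTF_8));
        try (InputStream disk = fromFile(file); InputStream memory = fromString(html)) {
            // Both sources go through the identical consumer.
            System.out.println("from disk:   " + consume(disk) + " bytes");
            System.out.println("from memory: " + consume(memory) + " bytes");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```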

~

yawmarka at 2007-7-29 13:57:03
# 4

> Fortunately, JTidy doesn't care where the HTML comes from; just get the file contents as an input stream, and you're all set.
>
> Best of luck!

The example on the JTidy website actually takes three parameters, the first of which is the URL string that identifies the file source!

Thanks again.

Antananarivoa at 2007-7-29 13:57:03