JTidy
Does anyone know where I can find an example of using JTidy to parse an HTML file from the web?
I am hoping to use JTidy and SAX for some web mining.
Any pointers will be much appreciated.
> Any one know where i can find an example of using Jtidy to parse an HTML file from the web?
Yes; the JTidy website has examples, assuming you know how to get a web page as an input stream (which is unrelated to JTidy).
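For the "get a web page as an input stream" part, here is a minimal sketch using only the standard library. The class name `PageFetcher`, the method name `openUrl`, and the 10-second timeouts are assumptions made for illustration; nothing here comes from JTidy itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper, not part of JTidy: opens a URL as a plain InputStream.
final class PageFetcher {

    /** Opens the resource behind the given URL string. The caller must close the stream. */
    static InputStream openUrl(String url) throws IOException {
        URLConnection conn = URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
        conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds
        return conn.getInputStream();
    }
}
```

The returned stream is what you would then hand to the parser, ideally inside a try-with-resources block so it gets closed.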
~
> Yes; the JTidy website has examples, assuming you
> know how to get a web page as an input stream (which
> is unrelated to JTidy).
>
Thanks. I hope to start by being able to parse an HTML file saved on disc and then proceed from there.
> Thanks. I hope to start by being able to parse an
> HTML file saved on disc and then proceed from there.
Fortunately, JTidy doesn't care where the HTML comes from; just get the file contents as an input stream, and you're all set.
Best of luck!
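To illustrate the point that the source does not matter, here is a rough sketch of one helper that returns an input stream for either a saved file or a web address. The names `HtmlSource` and `open` are made up for this example, and the prefix check is a simplifying assumption. The JTidy hand-off is shown only as a comment, since it needs the JTidy jar on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of JTidy: one entry point for both kinds of source.
final class HtmlSource {

    /** Returns a stream for an http(s) URL or, failing that, a local file path. */
    static InputStream open(String source) throws IOException {
        if (source.startsWith("http://") || source.startsWith("https://")) {
            return URI.create(source).toURL().openStream();
        }
        return Files.newInputStream(Paths.get(source));
    }
}

// With JTidy available, the stream could then be parsed along these lines:
//   org.w3c.dom.Document doc = new org.w3c.tidy.Tidy().parseDOM(in, null);
```

Starting with a file on disk and switching to a URL later would then only change the string passed to `open`.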
~
> Fortunately, JTidy doesn't care where the HTML comes
> from; just get the file contents as an input stream,
> and you're all set.
>
> Best of luck!
The example on the JTidy website actually takes three parameters, the first of which is the URL string that identifies the file source!
Thanks again.