What is a DomParser/SaxParser?

Hi all,

I need to parse a couple of HTML files to make some changes to the tags. I was just going to use a simple BufferedReader/BufferedWriter with a substring replacement method, but somebody suggested using a parser, specifically a SaxParser. What do you all think? And for that matter, what is a SaxParser, and how does it differ from a DomParser?

Regards,

Mat

[396 byte] By [MatLL] at [2007-9-26 4:48:16]
# 1

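SAX stands for Simple API for XML. It is event-driven: the parser reads the document from start to finish and calls your handler as it meets each element and piece of text. It is used for reading XML documents and extracting information from them. However, it builds no in-memory tree, so it does not support manipulating the internal structure of the documents it reads.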
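To make the event-driven part concrete, here is a minimal sketch using the JDK's built-in SAX parser. The class name SaxTagCounter and the method countTags are made up for illustration, and the input is assumed to be well-formed markup.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class SaxTagCounter {

    // Counts how often each element name occurs. The parser calls
    // startElement once per opening tag; nothing else is kept in memory.
    static Map<String, Integer> countTags(String xml) throws Exception {
        final Map<String, Integer> counts = new TreeMap<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                counts.merge(qName, 1, Integer::sum);
            }
        });
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><p>one</p><p>two</p></body></html>";
        System.out.println(countTags(page)); // prints {body=1, html=1, p=2}
    }
}
```

Note that the handler only observes the document as it streams past. To change tags with SAX you would have to write the modified output yourself from inside the callbacks.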

DOM stands for Document Object Model, and it specifies how HTML and XML documents can be represented as a tree of objects. It differs from SAX in that it can manipulate the internal structure of the documents it reads, but any changes made to the document are only stored in memory and don't affect the file directly. Your program has to write the modified document back out itself.
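A rough sketch of that parse, modify, write-back cycle with the JDK's DOM and transformer classes might look like the following. The class DomTagEditor and the method setAttributeOnAll are illustrative names only, and the input is assumed to be well formed. The output method is forced to "xml" here so that the serializer does not switch to HTML output rules for an html root element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class DomTagEditor {

    // Sets an attribute on every element with the given tag name and
    // returns the whole document serialized back to text.
    static String setAttributeOnAll(String xml, String tag,
                                    String attr, String value) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // The change happens on the in-memory tree only.
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            ((Element) nodes.item(i)).setAttribute(attr, value);
        }

        // Writing the tree back out is a separate, explicit step.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><p>one</p><p>two</p></body></html>";
        System.out.println(setAttributeOnAll(page, "p", "class", "intro"));
    }
}
```

To update a file instead of a string, the same transform call can be given a StreamResult built from a java.io.File.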

I have not used either of these or XML yet, but I am learning about them currently.

HTH

Jeff

marendoj at 2007-6-29 18:38:28 > top of Java-index,Archived Forums,Java Programming...
# 2
A good place to start: http://java.sun.com/xml/tutorial_intro.html
cafal at 2007-6-29 18:38:28
# 3

DOM = Document Object Model

SAX = Simple API for XML

Instead of going with SAX or DOM, you can achieve this very quickly with a StringTokenizer, which is very fast compared to SAX or DOM (if the HTML is small).

If it is big, it is better to go with SAX, since it does not load the entire HTML into memory, whereas DOM does, leading to slow performance.
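For the small-file case, a plain string approach could be sketched like this. It uses String.replace on each line rather than a StringTokenizer, and the class TagReplacer and method replaceTag are invented names. It only handles tags without attributes that are not split across lines, so treat it as a starting point rather than a general solution.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper class, not part of any library.
class TagReplacer {

    // Copies the input line by line, renaming <oldTag> and </oldTag>
    // to <newTag> and </newTag>. Tags with attributes are left untouched.
    static void replaceTag(Reader in, Writer out,
                           String oldTag, String newTag) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.replace("<" + oldTag + ">", "<" + newTag + ">")
                       .replace("</" + oldTag + ">", "</" + newTag + ">");
            writer.write(line);
            writer.newLine();
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        replaceTag(new StringReader("<p>Some <b>bold</b> text</p>"), out, "b", "strong");
        System.out.print(out);
    }
}
```

For real files, the Reader and Writer arguments could be a FileReader and a FileWriter pointing at the source file and a separate output file.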

With either SAX or DOM, the input HTML has to be well formed (to understand what that means, read the XML specification from the W3C).
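A quick way to see what well formed means in practice is to hand a fragment to the parser and check whether it is rejected. This is only a sketch: WellFormedCheck and isWellFormed are made-up names, and it relies on DefaultHandler rethrowing fatal parse errors as exceptions.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class WellFormedCheck {

    // Returns true if the markup parses as XML, false if the parser
    // reports a well-formedness error.
    static boolean isWellFormed(String markup) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(markup)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // e.g. an element that is never closed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Typical browser-style HTML: <br> is never closed.
        System.out.println(isWellFormed("<p>one<br>two</p>"));
        // The same fragment with a self-closing <br/>.
        System.out.println(isWellFormed("<p>one<br/>two</p>"));
    }
}
```

A lot of HTML written for browsers fails this check, which is worth testing on the actual files before committing to SAX or DOM.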

That is it.

Sriram

planswerme at 2007-6-29 18:38:28