DOM text node content

Hello, I am trying to develop a web mining application that extracts data from a web page. I have used JTidy to give me a DOM tree of the page contents.

I have traversed the document and reached a text node (see the last line of code below) whose contents I would like to view.

I don't know how to get the contents of the text node.

Any pointers will be much appreciated.

The code excerpt and output below might shed some light.

org.w3c.dom.NodeList contents_of_final_table = finalNode.getChildNodes();
org.w3c.dom.Node tableDATA_of_finalNode = contents_of_final_table.item(i);
System.out.println(tableDATA_of_finalNode.getFirstChild().getNextSibling().getFirstChild().getNodeType());
org.w3c.dom.Node text = tableDATA_of_finalNode.getFirstChild().getNextSibling().getFirstChild();
System.out.println(text.getNodeName());

Output:

init:

deps-module-jar:

deps-ear-jar:

deps-jar:

compile-single:

run-main:

#text

BUILD SUCCESSFUL (total time: 0 seconds)

[1080 byte] By [Antananarivoa] at [2007-11-27 11:56:35]
# 1

Under each element node there will be a #text node. Its name, which is what getNodeName() prints, is always "#text" and does not contain any data.

The actual data is held at the #text node's level in the hierarchy, as that node's value.
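As a rough illustration of the point above, here is a minimal sketch that walks a small tree and collects the value of every non-blank text node. It assumes the standard JAXP parser rather than JTidy so that it runs on its own; the class name TextNodeWalk, the helper names, and the sample markup are made up for the example. The same org.w3c.dom.Node calls should apply to a tree obtained from JTidy, though that is not verified here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of JTidy or the original post.
class TextNodeWalk {

    // Adds the trimmed value of every non-blank text node below 'node', in document order.
    static void collectText(Node node, List<String> out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            // getNodeName() would return "#text" here; the data is in getNodeValue().
            String value = node.getNodeValue();
            if (value != null && !value.trim().isEmpty()) {
                out.add(value.trim());
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    // Parses a well-formed snippet with the JAXP parser (an assumption; the post uses JTidy).
    static List<String> textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        collectText(doc.getDocumentElement(), out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // The line breaks and indentation become whitespace-only #text nodes, which are skipped.
        String xml = "<table>\n  <tr>\n    <td>first</td>\n    <td>second</td>\n  </tr>\n</table>";
        System.out.println(textOf(xml)); // prints [first, second]
    }
}
```

The whitespace check is there because the indentation between tags usually shows up as its own #text node, which is likely why many of the #text nodes met while traversing look empty.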

srinivassa at 2007-7-29 19:08:25
# 2

Thanks for the reply. I still have a problem, though: when I get further down the tree and find a #text node that actually has some data, how do I get it?

Antananarivoa at 2007-7-29 19:08:25
# 3

You can find the solution with a web search.

The logic is to treat "#text" as just a label: go one level down from the element to its #text child and read that node's value to get your data.

For example,

<name>me</name>

is actually stored as

--name
  |
  |--#text
     |--me
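The example above can be sketched in code as follows. This is only an illustration: it parses the snippet with the standard JAXP parser instead of JTidy, and the class name NameTextDemo and the method describeFirstChild are invented for the example. It reads the element's first child and returns the element name, the child's name, and the child's value.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not taken from the original thread.
class NameTextDemo {

    // Returns { element name, first child's name, first child's value } for the root element.
    static String[] describeFirstChild(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node element = doc.getDocumentElement(); // the <name> element
        Node text = element.getFirstChild();     // its #text child, one level down
        return new String[] { element.getNodeName(), text.getNodeName(), text.getNodeValue() };
    }

    public static void main(String[] args) throws Exception {
        String[] parts = describeFirstChild("<name>me</name>");
        System.out.println(parts[0]); // prints name
        System.out.println(parts[1]); // prints #text
        System.out.println(parts[2]); // prints me
    }
}
```

Applied to the code in the question, this would mean calling text.getNodeValue() instead of text.getNodeName(). The DOM Level 3 method getTextContent() is another option on parsers that support it, but older JTidy builds may not implement it, so getNodeValue() is the safer assumption.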

srinivassa at 2007-7-29 19:08:25