How to distinguish between two elements with the same name in an XML file

The following is my source code

public String LineExist(String line)
{
    char cc='A';
    Pattern Regex = Pattern.compile("<product>00000000<citation citation-type=\".*?\" id=\"\">");
    for(Matcher RegexMatcher = Regex.matcher(line); RegexMatcher.find();)
    {
        String chkEntity = RegexMatcher.group();
        int lastIndex = chkEntity.lastIndexOf("\">");
        String subchkEntity1=chkEntity.substring(0,lastIndex);
        line = line.replaceFirst(chkEntity,subchkEntity1+"ref00"+cc+"\">");
        cc++;
    }
    return line;
}

I have two types of the element citation. The first one has the parent element <product>, that is

<product>

<citation citation-type="book" id="">

</citation>

</product>

and the second one is without the parent element <product>, that is

<citation citation-type="book" id="">

In my code above, I replace each newline in the XML file with 00000000, and I am able to insert the attribute value of id for the first type.

My problem is: how do I distinguish between the two types of <citation> and insert the attribute values for the second type?

Currently my XML output is

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE article SYSTEM "ABC.dtd">

<article xmlns="XYZ">

<front>

<product>

<citation citation-type="book" id="ref00A">

</citation>

</product>

<product>

<citation citation-type="book" id="ref00B">

</citation>

</product>

</front>

<ref-list>

<title>REFERENCES</title>

<citation citation-type="book" id="">

...

<citation citation-type="book" id="">

...

...

<citation citation-type="book" id="">

...

<citation citation-type="book" id="">

</citation>

</ref-list>

</article>

But I need my output to be the following

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE article SYSTEM "ABC.dtd">

<article xmlns="XYZ">

<front>

<product>

<citation citation-type="book" id="ref00A">

</citation>

</product>

<product>

<citation citation-type="book" id="ref00B">

</citation>

</product>

</front>

<ref-list>

<title>REFERENCES</title>

<citation citation-type="book" id="1">

...

<citation citation-type="book" id="2">

...

...

<citation citation-type="book" id="3">

...

<citation citation-type="book" id="4">

</citation>

</ref-list>

</article>

[4317 byte] By [sony_tja] at [2007-11-27 7:38:46]
# 1
Could anyone please provide me with a solution?
sony_tja at 2007-7-12 19:19:22 > top of Java-index,Java Essentials,New To Java...
# 2
You need to use an XPath or a DOM parser rather than this simplistic regular-expression approach.
ejpa at 2007-7-12 19:19:22
# 3
With the given coding only could someone please tell me the solution for my problem.
sony_tja at 2007-7-12 19:19:22
# 4
> With the given coding only could someone please tell
> me the solution for my problem.

um.... if the given coding worked.... you wouldn't need another solution? I think the poster above (ejp) has suggested a valid suggestion. Why not implement it?
petes1234a at 2007-7-12 19:19:22
# 5

> > With the given coding only could someone please tell
> > me the solution for my problem.
>
> um.... if the given coding worked.... you wouldn't need another solution? I think the poster above (ejp) has suggested a valid suggestion. Why not implement it?

Exactly. If 'the given coding only' worked, you would already have a solution, you wouldn't have a problem, and you wouldn't be posting this question.

You have to change your 'given coding' somehow. For a start, as you need to detect context, you will have to consider more than one line at a time, which throws out your present method signature. Much the easiest way to do that is to use XPath, as I already said. If you want to do it some other, harder, way, good luck.
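
As a rough illustration of the parser-based route, here is a minimal sketch that uses the JDK's DOM API rather than XPath (swapped in here because the default namespace xmlns="XYZ" makes XPath expressions more awkward). The class name CitationDomSketch and the helpers parse and assignIds are hypothetical, not from the thread. The sample document leaves out the DOCTYPE so that the parser does not try to load ABC.dtd. The ref00A / ref001 numbering follows the output asked for later in the thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: distinguishes <citation> elements by their parent node.
class CitationDomSketch {

    // Parses an XML string with the default (non-namespace-aware) DOM parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Fills every empty id: letters under <product>, numbers everywhere else.
    static void assignIds(Document doc) {
        NodeList citations = doc.getElementsByTagName("citation");
        char letter = 'A';
        int number = 1;
        for (int i = 0; i < citations.getLength(); i++) {
            Element citation = (Element) citations.item(i);
            if (!citation.getAttribute("id").isEmpty()) {
                continue; // leave ids that are already set
            }
            Node parent = citation.getParentNode();
            boolean underProduct = parent != null && "product".equals(parent.getNodeName());
            String id = underProduct ? "ref00" + (letter++) : "ref00" + (number++);
            citation.setAttribute("id", id);
        }
    }

    public static void main(String[] args) {
        String xml = "<article xmlns=\"XYZ\">"
                + "<front><product><citation citation-type=\"book\" id=\"\"/></product></front>"
                + "<ref-list><citation citation-type=\"book\" id=\"\"/></ref-list>"
                + "</article>";
        Document doc = parse(xml);
        assignIds(doc);
        NodeList citations = doc.getElementsByTagName("citation");
        for (int i = 0; i < citations.getLength(); i++) {
            System.out.println(((Element) citations.item(i)).getAttribute("id"));
        }
    }
}
```

Because the parent is read from the tree, the check does not depend on line breaks or on the 00000000 placeholder. Writing the modified document back to a file (for example with javax.xml.transform) is left out of the sketch.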

ejpa at 2007-7-12 19:19:22
# 6

You can use the same regex to match both kinds of element:

Pattern p = Pattern.compile("(<product>00000000)?<citation citation-type=\".*?\" id=\"\">");
Matcher m = p.matcher(line);
while (m.find())
{
    if (m.start(1) != -1)
    {
        // it has a <product> parent
    }
    else
    {
        // it doesn't
    }
}

But you really should use XML-specific tools to process XML. Using hackish approaches like this, you'll end up introducing errors that an XML library would have prevented. That's one of the main reasons why XML was invented.

uncle_alicea at 2007-7-12 19:19:22
# 7

I hope this is what you suggested I do:

public String ProdCitLine(String line)
{
    char cc='A';
    int num=1;
    Pattern p = Pattern.compile("(<product>00000000)?<citation citation-type=\".*?\" id=\"\">");
    Matcher m = p.matcher(line);
    while (m.find())
    {
        if (m.start(1) != -1)
        {
            // it has a <product> parent
            for(Matcher RegexMatcher1 = p.matcher(line); RegexMatcher1.find();)
            {
                String chkEntity1 = RegexMatcher1.group();
                int lastIndex = chkEntity1.lastIndexOf("\">");
                String subchkEntity1=chkEntity1.substring(0,lastIndex);
                line = line.replaceFirst(chkEntity1,subchkEntity1+"ref00"+cc+"\">");
                cc++;
            }
        }
        else
        {
            // it doesn't
            for(Matcher RegexMatcher2 = p.matcher(line); RegexMatcher2.find();)
            {
                String chkEntity2 = RegexMatcher2.group();
                int lastIndex = chkEntity2.lastIndexOf("\">");
                String subchkEntity2=chkEntity2.substring(0,lastIndex);
                line = line.replaceFirst(chkEntity2,subchkEntity2+"ref00"+num+"\">");
                num++;
            }
        }
    }
    return line;
}

My output is

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE article SYSTEM "ABC.dtd">

<article xmlns="XYZ">

<front>

<product>

<citation citation-type="book" id="ref00A">

</citation>

</product>

<product>

<citation citation-type="book" id="ref00B">

</citation>

</product>

</front>

<ref-list>

<title>REFERENCES</title>

<citation citation-type="book" id="ref00C">

<citation citation-type="book" id="ref00D">

</citation>

</ref-list>

</article>

But I need my XML output as

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE article SYSTEM "ABC.dtd">

<article xmlns="XYZ">

<front>

<product>

<citation citation-type="book" id="ref00A">

</citation>

</product>

<product>

<citation citation-type="book" id="ref00B">

</citation>

</product>

</front>

<ref-list>

<title>REFERENCES</title>

<citation citation-type="book" id="ref001">

<citation citation-type="book" id="ref002">

</citation>

</ref-list>

</article>

sony_tja at 2007-7-12 19:19:22
# 8

I rectified my code and now it's working. Thanks a lot for the help, uncle_alice.

My code now is

public String ProdCitLine(String line)
{
    char cc='A';
    int num=1;
    Pattern Regex = Pattern.compile("(<product>00000000)?<citation citation-type=\".*?\" id=\"\">");
    Matcher m = Regex.matcher(line);
    while (m.find())
    {
        if (m.start(1) != -1)
        {
            // it has a <product> parent
            String chkEntity1 = m.group();
            int lastIndex = chkEntity1.lastIndexOf("\">");
            String subchkEntity1=chkEntity1.substring(0,lastIndex);
            line = line.replaceFirst(chkEntity1,subchkEntity1+"ref00"+cc+"\">");
            cc++;
        }
        else
        {
            String chkEntity2 = m.group();
            int lastIndex = chkEntity2.lastIndexOf("\">");
            String subchkEntity2=chkEntity2.substring(0,lastIndex);
            line = line.replaceFirst(chkEntity2,subchkEntity2+"ref00"+num+"\">");
            num++;
        }
    }
    return line;
}
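
One caveat with the version above: String.replaceFirst treats the matched text as a regular expression and always rewrites the first occurrence it finds in the string, so it only works while the <product> citations come before the bare ones and the attribute values contain no regex metacharacters. A minimal sketch of an alternative, assuming the same 00000000 newline placeholder, rewrites each match where it was found using Matcher.appendReplacement. The class name CitationIds and the method name assignIds are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: same numbering scheme, but each match is rewritten in place.
class CitationIds {

    // Group 1: optional "<product>" prefix. Group 2: the tag up to the empty id value.
    // Group 3: the closing quote and bracket.
    private static final Pattern CITATION = Pattern.compile(
            "(<product>00000000)?(<citation citation-type=\"[^\"]*\" id=\")(\">)");

    static String assignIds(String line) {
        char letter = 'A';
        int number = 1;
        Matcher m = CITATION.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            boolean hasProductParent = m.group(1) != null;
            String prefix = hasProductParent ? m.group(1) : "";
            String id = hasProductParent ? "ref00" + (letter++) : "ref00" + (number++);
            // quoteReplacement keeps any '$' or '\' in the matched text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(prefix + m.group(2) + id + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "<product>00000000<citation citation-type=\"book\" id=\"\">"
                + "00000000<citation citation-type=\"book\" id=\"\">";
        System.out.println(assignIds(line));
    }
}
```

The input is scanned once and never modified while the matcher is running, so the order in which the two kinds of citation appear no longer matters.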

sony_tja at 2007-7-12 19:19:22