Parsing
Hi Everyone,
I am parsing an XML document with the org.xml.sax.* classes, and my handler class extends HandlerBase.
An extract of the XML file looks like this:
<FILESEQUENCE>
<FILE TYPE = "Word">
<NAME>DetailsInce</NAME>
<SIZE>6100</SIZE>
</FILE>
<FILE TYPE = "Word">
<NAME>Addresses</NAME>
<SIZE>2000</SIZE>
</FILE>
<FILE TYPE = "Excel">
<NAME>Accounts</NAME>
<SIZE>2100</SIZE>
</FILE>
</FILESEQUENCE>
I need to count how many times each ATTRIBUTE value occurs. For example, the XML document has 3 Word documents, i.e. <FILE TYPE = "Word">. I need to create a report that presents the total number of Word documents.
I also need to add up the file sizes given in the ELEMENTS, i.e. <SIZE>6100</SIZE>, and include the total size of all files in the XML document.
Can anyone suggest methods that will allow me to extract the data and create the report?
I would take this approach:
- Obtain a NodeList of all <FILE> Nodes.
- Create & initialize some data structure to store the # of occurrences of each file type. I personally would use a HashMap (key type = String, value type = Integer, since a HashMap cannot hold a primitive int).
- Create & initialize a size counter to 0.
- Iterate through my NodeList for each <FILE> Node:
* Check to see if the TYPE attribute value exists yet as a key in the HashMap. If so, then increment the Integer value. If it does not already exist in the map, add it to the map with a value of 1.
* Add the <SIZE> value of each file to the size counter.
I have given you an outline for writing a method to solve your problem. Can you come up with the Java code to implement the method?
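For reference, here is a minimal sketch of the outline above. It assumes the document is loaded through the DOM API (DocumentBuilder and NodeList) rather than the SAX handler mentioned in the question. The class name FileReportDom, its fields, and the fromXml helper are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical class: a DOM-based version of the outline (assumption: DOM, not SAX).
class FileReportDom {
    final Map<String, Integer> typeCounts = new HashMap<>(); // TYPE value -> number of files
    long totalSize = 0;                                      // sum of all <SIZE> values

    static FileReportDom fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            FileReportDom report = new FileReportDom();
            // Obtain a NodeList of all <FILE> nodes.
            NodeList files = doc.getElementsByTagName("FILE");
            for (int i = 0; i < files.getLength(); i++) {
                Element file = (Element) files.item(i);
                // Insert 1 for an unseen TYPE, otherwise add 1 to the stored count.
                report.typeCounts.merge(file.getAttribute("TYPE"), 1, Integer::sum);
                // Add the text of the first <SIZE> child, if present, to the size counter.
                NodeList sizes = file.getElementsByTagName("SIZE");
                if (sizes.getLength() > 0) {
                    report.totalSize += Long.parseLong(sizes.item(0).getTextContent().trim());
                }
            }
            return report;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<FILESEQUENCE>"
                + "<FILE TYPE = \"Word\"><NAME>DetailsInce</NAME><SIZE>6100</SIZE></FILE>"
                + "<FILE TYPE = \"Word\"><NAME>Addresses</NAME><SIZE>2000</SIZE></FILE>"
                + "<FILE TYPE = \"Excel\"><NAME>Accounts</NAME><SIZE>2100</SIZE></FILE>"
                + "</FILESEQUENCE>";
        FileReportDom report = fromXml(xml);
        report.typeCounts.forEach((type, n) -> System.out.println(type + " Files = " + n));
        System.out.println("Total size = " + report.totalSize);
    }
}
```

The report is printed only after the loop has finished, so each total appears once. If you stay with a SAX handler there is no NodeList, but the same map and size counter can be filled from the handler callbacks.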
PUHfyn,
This is the method I have so far for the attributes problem, based on your advice:
public void startElement(String elementName, AttributeList al) throws SAXException
{
    // Executed when a start element is encountered.
    // elementName contains the name of the element and al contains
    // a list of its attributes.
    // Iterate over the attributes of this element.
    for (int j = 0; j < al.getLength(); j++)
    {
        String attributeName = al.getName(j);
        String attributeValue = al.getValue(j);
        if (attributeName.equals("TYPE"))
        {
            if (total.containsKey(attributeValue)) {
                // Type seen before: increment its count.
                Integer temp = (Integer) total.get(attributeValue);
                total.put(attributeValue, new Integer(temp.intValue() + 1));
            }
            else {
                // First occurrence of this type.
                total.put(attributeValue, new Integer(1));
            }
        }
    }
    // Print the current contents of the totals map.
    Iterator i = total.entrySet().iterator();
    while (i.hasNext()) {
        Map.Entry temp = (Map.Entry) i.next();
        System.out.println(temp.getKey() + " " + temp.getValue());
    }
}
This iterates through the parsed document and appears to run the method each time a "TYPE" attribute is encountered. Can you suggest how it can be amended so that it only presents the final totals, i.e. Word Files = 3, Excel Files = 2 etc.?
The method looks pretty good to me. I would need to see the entire .java file, including where this method is called, to work out why
> "This iterates through the parsed document and appears to run the method
> each time a "TYPE" attribute is encountered"
If you are getting output but the numbers are wrong, experiment with it some more and you will probably be able to get it working correctly.
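One possible cause, judging only from the method posted above: the parser calls startElement once for every element, and the printing loop sits inside that method, so the running totals are printed on every call. A minimal sketch of one way around this is to keep only the counting in startElement and move the printing into endDocument, which the parser calls once at the end. The sketch assumes the newer SAX2 DefaultHandler instead of HandlerBase (HandlerBase also has an endDocument method, so the same idea should carry over). The class name FileReportSax, the sizeText field, and the parse helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts TYPE values and sums <SIZE>, reporting once at the end.
class FileReportSax extends DefaultHandler {
    final Map<String, Integer> total = new TreeMap<>(); // TYPE value -> number of files
    long totalSize = 0;                                 // sum of all <SIZE> values
    private StringBuilder sizeText = null;              // non-null only while inside <SIZE>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("FILE")) {
            String type = atts.getValue("TYPE");
            if (type != null) {
                total.merge(type, 1, Integer::sum);     // count only, no printing here
            }
        } else if (qName.equals("SIZE")) {
            sizeText = new StringBuilder();             // start collecting the size text
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than assign.
        if (sizeText != null) {
            sizeText.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("SIZE") && sizeText != null) {
            totalSize += Long.parseLong(sizeText.toString().trim());
            sizeText = null;
        }
    }

    @Override
    public void endDocument() {
        // Called once, after the whole document has been read.
        for (Map.Entry<String, Integer> e : total.entrySet()) {
            System.out.println(e.getKey() + " Files = " + e.getValue());
        }
        System.out.println("Total size = " + totalSize);
    }

    // Hypothetical helper that parses an XML string with this handler.
    static FileReportSax parse(String xml) {
        try {
            FileReportSax handler = new FileReportSax();
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        parse("<FILESEQUENCE>"
                + "<FILE TYPE = \"Word\"><NAME>DetailsInce</NAME><SIZE>6100</SIZE></FILE>"
                + "<FILE TYPE = \"Excel\"><NAME>Accounts</NAME><SIZE>2100</SIZE></FILE>"
                + "</FILESEQUENCE>");
    }
}
```

The sketch also covers the file-size half of the original question: the text of each <SIZE> element is collected in characters and added to the total in endElement.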