Cost of parsing a DOM object
Can somebody tell me how expensive it is to parse a DOM object?
My code builds a DOM object from the DB on initialization (to provide caching) and keeps it live in memory for the whole execution time. There are several threads running, and each requires some parameters which were previously populated into a hashtable from the DB for each thread.
Now that this caching mechanism is in place, I can provide these threads with the parameters directly one by one, or as a hashtable like before, except this time it would come from the cache instead of the DB.
This hashtable is parsed every time a new request is processed. The threads process up to 800-1000 records a minute. If I provide the parameters directly from the cache, then I'll have to parse the DOM object every time.
So can somebody tell me whether parsing the hashtable or parsing the DOM is the better option, considering both of them remain in memory during the whole execution time?
[967 byte] By [stanfya] at [2007-11-26 16:31:02]

"Parsing the hashtable" is a phrase that makes no sense. "Parsing" is when you take some serial data (like a stream of bytes, or a file) that has an implicit structure to it beyond the one-dimensional serial data, and then create a data structure that represents the implicit structure (it makes the structure explicit). A hashtable isn't serial data.
If your goal is to speed up processing, it can make sense to cache, but it's not clear why you'd want to cache stuff in a DOM object as opposed to, say, a hashtable. But it's not exactly clear either whether that's what you meant. Maybe you mean that you're using XML file(s) as a database and you're putting the cache in front of the DOM object? I can't tell.
Thanks for the reply...
I'll put it this way: the DB (which is not an XML file but MySQL) has ids, and for each id I have name-value pairs. The same names can occur under different ids.
So it can be represented in XML as
<ids>
  <id>
    <param name="" value=""/>
    <param name="" value=""/>
  </id>
  .....
</ids>
Keeping it in a hashtable would, I think, be a bit complex... I'm not writing this to an XML file but building a DOM object and keeping it alive in a singleton class during the whole execution.
Now my thread needs values based on id and param name. Many such threads are running concurrently, each processing many records a minute.
Now I can either return a hashtable of ids and values and then pick values from it based on the requester's id for each thread, OR I can parse the DOM every time and return the value based on id and param name through a method of the singleton class.
I wanted to know which of these two would be the better option... or if there is any other idea, you're most welcome.
But why are you creating a DOM object at all? That's almost certainly going to be harder and more computationally expensive than just using a hashtable. For that matter, just using MySQL is probably sufficient.
You're still not using the word "parse" correctly. If you have a DOM object, then you don't need to parse it anymore. You parse it from an XML file. I think you mean you could traverse the DOM object.
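To make the parse/traverse distinction concrete, here is a rough sketch using the standard JAXP DOM API. The class and method names are made up for illustration, and since your XML outline doesn't show where the id value lives, I'm assuming a hypothetical `key` attribute on each `<id>` element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class ParseVsTraverse {

    // Parsing: serial text goes in, a tree comes out. This is done once.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Traversal: walk the tree that already exists. Nothing is parsed here,
    // but every lookup scans the <id> elements and their <param> children.
    static String lookup(Document doc, String id, String name) {
        NodeList ids = doc.getElementsByTagName("id");
        for (int i = 0; i < ids.getLength(); i++) {
            Element idEl = (Element) ids.item(i);
            // "key" is an assumed attribute name, not taken from the original XML.
            if (!id.equals(idEl.getAttribute("key"))) {
                continue;
            }
            NodeList params = idEl.getElementsByTagName("param");
            for (int j = 0; j < params.getLength(); j++) {
                Element param = (Element) params.item(j);
                if (name.equals(param.getAttribute("name"))) {
                    return param.getAttribute("value");
                }
            }
        }
        return null; // no such id or parameter
    }
}
```

The traversal cost grows with the number of ids and params, which is the main reason a hashtable lookup is likely to be cheaper for this access pattern.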
Anyway...if multiple objects are going to read the data, submitting their own ID numbers to get a subset of data, then rather than create a hashtable representing the whole DOM, you could create a smaller hashtable representing only each object's portion of the data. The problem with returning hashtables is that then you get data expiration issues -- presumably you're using the singleton object because it's representing data that can change arbitrarily and from different sources. If you return hashtables you won't know how current the data in the returned hashtables is.
Another possibility is that you could use a hashtable for caching, and keep it in the singleton object itself. That way you can control data expiration.
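Something along these lines is what I have in mind; treat it as a sketch, not a finished design. `ParamCache` and its `loader` function are hypothetical names: the loader stands in for whatever code reads one id's name-value pairs from MySQL, and your singleton would hold a single instance of the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache kept inside the singleton: id -> (param name -> value).
final class ParamCache {
    private final Map<String, Map<String, String>> byId = new ConcurrentHashMap<>();
    // Assumed to load all name-value pairs for one id from the database.
    private final Function<String, Map<String, String>> loader;

    ParamCache(Function<String, Map<String, String>> loader) {
        this.loader = loader;
    }

    // Threads ask for a single value; they never get the map itself,
    // so nobody ends up holding a stale copy of the data.
    String get(String id, String name) {
        Map<String, String> params = byId.computeIfAbsent(id, loader);
        return params == null ? null : params.get(name);
    }

    // Expiration stays under the cache's control: drop an entry (or all of
    // them) when the underlying rows change, and the next get() reloads it.
    void invalidate(String id) {
        byId.remove(id);
    }

    void invalidateAll() {
        byId.clear();
    }
}
```

Both lookups are hash lookups, so the per-record cost should not depend on how many ids or params there are, and `ConcurrentHashMap` lets the worker threads read concurrently without extra locking in the caller.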
Probably the best thing you could do is hide the implementation behind the singleton class, not allowing any implementation details to seep into the interface. Then do the simplest possible thing (which would probably mean just using MySQL and skipping the DOM, if possible) and only add caching or whatever if later profiling suggested it was necessary.
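As a rough illustration of that last point, the threads could depend on a one-method interface while the implementation behind it is swapped freely. Everything named here is an assumption made for the sketch: the `ParamSource` interface, both classes, and the `params` table with its `id`, `param_name` and `param_value` columns.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;
import javax.sql.DataSource;

// The only thing the worker threads see.
interface ParamSource {
    String get(String id, String name);
}

// Simplest possible implementation: ask the database on every call.
// Table and column names are hypothetical.
final class JdbcParamSource implements ParamSource {
    private final DataSource dataSource;

    JdbcParamSource(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public String get(String id, String name) {
        String sql = "SELECT param_value FROM params WHERE id = ? AND param_name = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, id);
            ps.setString(2, name);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        } catch (SQLException e) {
            throw new IllegalStateException("parameter lookup failed", e);
        }
    }
}

// An in-memory implementation that can replace the JDBC one later
// (or in tests) without any caller changing.
final class MapParamSource implements ParamSource {
    private final Map<String, Map<String, String>> byId;

    MapParamSource(Map<String, Map<String, String>> byId) {
        this.byId = byId;
    }

    @Override
    public String get(String id, String name) {
        Map<String, String> params = byId.get(id);
        return params == null ? null : params.get(name);
    }
}
```

If profiling later shows the database round trip is too slow at your record rate, a caching implementation can go behind the same interface and the threads never need to know.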