Memory footprint - serialized vs real
Good day to the forum. I have been tasked with improving the data construct in an application. I have written a couple of different classes to handle the data and want to compare them to see which has the smaller memory footprint. To do this I am writing the objects to a ByteArrayOutputStream. How does the resulting size differ from the real memory usage? Is this a valid test?
Thanks for any insights as usual.
ST
> No opinions?
I never tried it and at least for me it'd be difficult to find out - I don't know the way the JVM handles all the references in memory. IMO chances are that either the sizes are equal, or that usual in-memory storage is smaller because of pooling (Strings, Integers) and possible (don't know) lack of reference type declarations. A serialized reference always has something like
[strange char]java.lang.String[strange chars]theName
inside. This way of storing the type might not be necessary for in-memory references. But as I said, I have no clue.
Maybe you should use a profiler... If you do, compare the memory usage to the size of the ByteArrayOutputStream that you get, and post the results here ;-) I think it would be interesting for a lot of people. Sorry for not helping much.
> Is this a valid test?
No. The serialized representation of an object contains information about the class along with the object's data. Take a look at the Java Object Serialization Specification for more info (or dump the serialized output).
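For anyone who wants to try the "dump the serialized output" suggestion, here is a minimal sketch. The class `Point` and its single int field are made up for illustration; the point is only that the stream carries the class name and field descriptors, so it is far larger than the 4 bytes of actual data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SerializedDump {
    // Hypothetical test class: one int field, so the real payload is 4 bytes.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 42;
    }

    // Serializes one object into a fresh stream and returns the raw bytes.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new Point());
        // ISO-8859-1 maps each byte to one char, so the class name stays readable.
        String text = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println("serialized size: " + bytes.length + " bytes");
        System.out.println("contains class name: " + text.contains("SerializedDump$Point"));
    }
}
```

The printed size includes the stream header, the class descriptor and the field name, none of which are stored per instance in memory.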
The in-memory representation of an object contains 8 bytes of overhead for normal objects, and (I believe) 12 bytes for an array (you can find this in one of the HotSpot docs). The actual object data takes up either 4 or 8 bytes per member (longs and doubles take 8, other primitives and references take 4). Arrays take length * N bytes plus padding, where N = 1 for byte and boolean, 2 for short and char, 4 for int/float/reference, and 8 for long/double.
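To make the arithmetic concrete, here is a rough sketch that just plugs in the numbers quoted above. The header sizes and per-field sizes are the ones from this post, not measured values, and rounding up to a multiple of 8 bytes is my assumption about the padding. Other JVMs (64-bit ones in particular) will give different figures, so treat this as an estimate only.

```java
public class ShallowSizeEstimate {
    static final int OBJECT_HEADER = 8;  // bytes, as quoted in the post
    static final int ARRAY_HEADER = 12;  // bytes, as quoted in the post ("I believe")
    static final int ALIGNMENT = 8;      // assumption: sizes are padded to 8-byte multiples

    // Rounds a size up to the next multiple of ALIGNMENT.
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // narrowFields: references and primitives other than long/double (4 bytes each).
    // wideFields: long and double fields (8 bytes each).
    static long objectSize(int narrowFields, int wideFields) {
        return align(OBJECT_HEADER + 4L * narrowFields + 8L * wideFields);
    }

    // elementBytes: 1 for byte/boolean, 2 for short/char, 4 for int/float/reference, 8 for long/double.
    static long arraySize(int length, int elementBytes) {
        return align(ARRAY_HEADER + (long) length * elementBytes);
    }

    public static void main(String[] args) {
        // Object with one int, one reference and one long: 8 + 4 + 4 + 8 = 24
        System.out.println(objectSize(2, 1));
        // byte[10]: 12 + 10 = 22, padded to 24
        System.out.println(arraySize(10, 1));
        // long[5]: 12 + 40 = 52, padded to 56
        System.out.println(arraySize(5, 8));
    }
}
```

Note that this is a shallow size: anything the fields point to (Strings, map entries and so on) has to be added separately.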
Thanks for the responses. I'm aware that the serialized representation contains information about the object other than the data, but that's my point. Barring an analysis tool, I feel that this data, including the class information, is pretty close to what resides in memory. It seems like a good side-by-side comparison between the sizes of two objects. Of course the serialized form will also cache data, so maybe it's not a good method. Hmmm. I've run the profiler against it but I can't figure out the total size of my test objects. For example, the original data management was based on a HashMap. In the profiler I see separate entries for the stored string values, hash entries, map entries, etc.
Can someone point me in the right direction to do a real-world side-by-side test of two data management methods? I know that I have come up with a more efficient method (in terms of memory); however, I need real-world stats to sell it to the folks upstairs.
Thanks as always, Forum
ST
Tiger (J2SE 5.0) has introduced an "Instrumentation" interface which, amongst other functions, will return the memory size of an object. Granted it's a complex job to get at it; you need to run an "agent" class with a premain() method to which the Instrumentation object is passed. But it should give you a pretty definitive answer.
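A minimal sketch of such an agent might look like the following. The class name `SizeAgent` and the jar name are made up; `premain(String, Instrumentation)` and `Instrumentation.getObjectSize(Object)` are the real java.lang.instrument entry points. The agent jar's manifest needs a `Premain-Class: SizeAgent` entry, and the JVM has to be started with `-javaagent:sizeagent.jar`.

```java
import java.lang.instrument.Instrumentation;

public class SizeAgent {
    // Set by the JVM via premain(); stays null if no agent was loaded.
    private static volatile Instrumentation instrumentation;

    // Called by the JVM before main() when started with -javaagent:sizeagent.jar
    // (hypothetical jar name).
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    // Returns the JVM's estimate of the shallow size of obj, in bytes.
    public static long sizeOf(Object obj) {
        Instrumentation inst = instrumentation;
        if (inst == null) {
            throw new IllegalStateException("agent not loaded; start the JVM with -javaagent");
        }
        return inst.getObjectSize(obj);
    }

    public static void main(String[] args) {
        try {
            System.out.println("Object:   " + sizeOf(new Object()));
            System.out.println("long[10]: " + sizeOf(new long[10]));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keep in mind that getObjectSize() reports an implementation-specific approximation for the one object only. It does not follow references, so for something like a HashMap you would have to walk the entries, keys and values yourself and add them up.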
objA.x = "Hello";
objB.x = "Hello";
There will only be one String "Hello" in memory. There will be two
Strings "Hello" on your disk. The same for other shared references.
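Here is a small sketch of that effect. The `Holder` class is made up for illustration. It also shows one wrinkle: if both objects go through the same ObjectOutputStream, the second occurrence of the shared String is written as a back-reference, so you only get two full copies when the objects are serialized separately.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SharedStringDemo {
    // Hypothetical holder class matching the objA.x / objB.x example.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        String x;
    }

    // Writes all given objects to ONE stream and returns the total byte count.
    static int serializedSize(Object... objects) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (Object obj : objects) {
                out.writeObject(obj);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        Holder objA = new Holder();
        Holder objB = new Holder();
        objA.x = "Hello";
        objB.x = "Hello";

        // Both fields point at the same interned String instance in memory.
        System.out.println("same instance in memory: " + (objA.x == objB.x));

        // Serialized separately, each stream carries its own copy of "Hello".
        int separate = serializedSize(objA) + serializedSize(objB);
        // In one stream, the second "Hello" is only a back-reference.
        int together = serializedSize(objA, objB);
        System.out.println("separate streams: " + separate + " bytes");
        System.out.println("one stream:       " + together + " bytes");
    }
}
```

Either way, the byte counts track how the stream was written rather than how the heap is laid out, which is another reason the serialized size is a poor stand-in for the in-memory footprint.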
Yup, and thanks. It looks like I'll just have to do some tests to see which object runs out of memory first, sort of a poke the cat method.
Thanks for the help
ST