How to improve the performance of serialization/deserialization?

Hi, Friends,

I have a question about how to improve the performance of serialization/deserialization.

When an object is serialized, the entire tree of objects rooted at the object is also serialized. When it is deserialized, the tree is reconstructed. For example, suppose a serializable Father object contains (a serializable field of) an array of Child objects. When a Father object is serialized, so is the array of Child objects.

For performance reasons, when I deserialize a Father object I don't want to deserialize any Child objects. However, I still need to know that the Father object has children, and I should be able to deserialize any individual child of that Father object when necessary.

Could you tell me how to achieve the above idea?

Thanks.

Youbin

[834 byte] By [youbin1] at [2007-9-26 5:25:12]
# 1

If you want more control over serialization, you should look into the Externalizable interface, which lets you decide exactly what is written to the stream.

With your idea you would still have to read in all of the bytes, so you would have to read them in and store them somewhere, then unpack them on demand. This would probably require custom proxy classes that you would have to write for each situation where you want this unpack-on-demand functionality.
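A minimal sketch of the "store the bytes, unpack on demand" idea might look like the following. The class name LazyBlob and its methods are hypothetical, made up for illustration; the sketch assumes the payload fits in memory as a byte array.

```java
import java.io.*;

/** Hypothetical holder: keeps an object in serialized form and unpacks it on first use. */
class LazyBlob<T extends Serializable> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final byte[] bytes;   // travels with the parent as a plain byte array
    private transient T value;    // unpacked copy, never written to the stream

    LazyBlob(T initial) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(initial);
        }
        bytes = buf.toByteArray();
    }

    boolean isUnpacked() { return value != null; }

    /** Deserializes the stored bytes the first time it is called, then caches the result. */
    @SuppressWarnings("unchecked")
    synchronized T get() throws IOException, ClassNotFoundException {
        if (value == null) {
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                value = (T) in.readObject();
            }
        }
        return value;
    }
}

class LazyBlobDemo {
    public static void main(String[] args) throws Exception {
        LazyBlob<int[]> pixels = new LazyBlob<>(new int[] {1, 2, 3});
        System.out.println(pixels.isUnpacked());   // false
        System.out.println(pixels.get().length);   // 3
        System.out.println(pixels.isUnpacked());   // true
    }
}
```

A parent object that holds a LazyBlob field still transfers every byte, but reading a byte array is cheap compared with rebuilding the object graph inside it.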

I don't think transmitting data only to ignore it is the correct solution. If you are serializing a lot of data that never gets used, then maybe you should consider not transmitting that data at all in the first place, and instead make a second remote call if and only if that data is needed. There is of course a balance between network concerns and performance.

A healthy boost can be gained in the following tried and tested way:

I believe default serialization writes field names along with the data. If you are not too concerned about version clashes, you could drop the field names from the data stream, again by using the Externalizable interface.
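As a rough illustration of the Externalizable approach, here is a sketch using two hypothetical classes, Sample and PlainSample, that are not part of the original question. As far as I know, default serialization records field names in the class descriptor once per class per stream rather than once per object, so the saving is mostly visible on small payloads; the demo prints both sizes so you can compare them yourself.

```java
import java.io.*;

/** Hypothetical class that controls its own stream format. */
class Sample implements Externalizable {
    int x, y, z;

    public Sample() { }   // Externalizable requires a public no-arg constructor
    Sample(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Only the raw values are written; no field names or types.
        out.writeInt(x);
        out.writeInt(y);
        out.writeInt(z);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Values must be read back in exactly the order they were written.
        x = in.readInt();
        y = in.readInt();
        z = in.readInt();
    }
}

/** Same data using default serialization, for comparison only. */
class PlainSample implements Serializable {
    private static final long serialVersionUID = 1L;
    int x = 1, y = 2, z = 3;
}

class ExternalizableDemo {
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] ext = toBytes(new Sample(1, 2, 3));
        Sample copy = (Sample) fromBytes(ext);
        System.out.println(copy.x + "," + copy.y + "," + copy.z);   // 1,2,3
        System.out.println("Externalizable: " + ext.length + " bytes");
        System.out.println("Serializable:   " + toBytes(new PlainSample()).length + " bytes");
    }
}
```

The price is that the read and write order becomes your versioning contract: adding or reordering a field breaks older streams unless you write a version number yourself.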

I hope that helps, good luck improving the performance of your program.

- Chris.

Brainbench MVP Java2.

cjwk at 2007-6-29 19:32:52 > top of Java-index,Archived Forums,Java Programming...
# 2
Chris,

I don't have much hands-on experience with Serialization and Externalization, but couldn't this be achieved by making the less frequently used objects transient, and manually serializing and deserializing them on request?

Omer
Oaq at 2007-6-29 19:32:52
# 3
If you mark a variable with the transient keyword, it will be ignored by serialization.
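A small sketch of that behavior, using a hypothetical Album class invented for this example: after a round trip, the transient field comes back as null while the ordinary field survives.

```java
import java.io.*;

/** Hypothetical class with one ordinary field and one transient field. */
class Album implements Serializable {
    private static final long serialVersionUID = 1L;
    String title;
    transient int[] pixels;   // skipped by default serialization

    Album(String title, int[] pixels) {
        this.title = title;
        this.pixels = pixels;
    }
}

class TransientDemo {
    /** Serializes the object to memory and reads it straight back. */
    static Album roundTrip(Album a) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(a);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return (Album) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Album copy = roundTrip(new Album("holiday", new int[1000]));
        System.out.println(copy.title);            // holiday
        System.out.println(copy.pixels == null);   // true
    }
}
```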
dsklyut at 2007-6-29 19:32:52
# 4

You could try something like this...

import java.io.*;
import java.util.*;

class Child implements Serializable {
    int id;
    Child(int _id) { id = _id; }
    public String toString() { return String.valueOf(id); }
}

class Father implements Serializable {
    // transient: written by writeObject() below rather than by default serialization
    transient Child[] children = new Child[10];

    public Father() {
        Arrays.fill(children, new Child(1001));
    }

    // Must be private, otherwise the serialization runtime never calls it.
    private void readObject(ObjectInputStream stream)
            throws IOException, ClassNotFoundException {
        stream.defaultReadObject();
        int numchildren = stream.readInt();
        // Father's constructor and field initializers do not run during
        // deserialization, so the array has to be allocated here.
        children = new Child[numchildren];
        for (int i = 0; i < numchildren; i++)
            children[i] = (Child) stream.readObject();
        // Do not close the stream here; it belongs to the caller.
    }

    private void writeObject(ObjectOutputStream stream) throws IOException {
        stream.defaultWriteObject();
        stream.writeInt(children.length);
        for (int i = 0; i < children.length; i++)
            stream.writeObject(children[i]);
    }

    Child[] getChildren() { return children; }
}

class FatherProxy {
    int numchildren;
    String filename;

    public FatherProxy(String _filename) throws IOException {
        filename = _filename;
        ObjectInputStream ois =
            new ObjectInputStream(new FileInputStream(filename));
        // The file starts with the child count, written before the Father object.
        numchildren = ois.readInt();
        ois.close();
    }

    int getNumChildren() { return numchildren; }

    Child[] getChildren() throws IOException, ClassNotFoundException {
        ObjectInputStream ois =
            new ObjectInputStream(new FileInputStream(filename));
        ois.readInt();   // skip the child-count header
        Father f = (Father) ois.readObject();
        ois.close();
        return f.getChildren();
    }
}

public class fatherref {
    public static void main(String[] args) throws Exception {
        // create the serialized file: child count first, then the Father object
        Father f = new Father();
        ObjectOutputStream oos =
            new ObjectOutputStream(new FileOutputStream("father.ser"));
        oos.writeInt(f.getChildren().length);
        oos.writeObject(f);
        oos.close();

        // read in just what is needed -- numchildren
        FatherProxy fp = new FatherProxy("father.ser");
        System.out.println("numchildren: " + fp.getNumChildren());

        // do some processing

        // you need the rest -- children
        Child[] c = fp.getChildren();
        System.out.println("children:");
        for (int i = 0; i < c.length; i++)
            System.out.println("i " + i + ": " + c[i]);
    }
}

CWalker807 at 2007-6-29 19:32:52
# 5

In the original post youbin1 said 'I should also be able to deserialize any child of that Father object when necessary. '

So I answered the question from the point of view that the data had to be sent and it was the client who made the choice as to what was unpacked and when. Making a field transient will prevent Serialization from sending it in the first place, true.

youbin also asked for a way to detect that the data wasn't sent and then to retrieve that data separately. I think we need more information about the environment he is serializing in, i.e. is it remote network calls, file storage, interprocess comms, long- or short-term storage, EJBs, etc.

At the end of the day, the tools that one has are the transient keyword and the Serializable and Externalizable interfaces.

- Chris.

Brainbench MVP Java2.

cjwk at 2007-6-29 19:32:52
# 6

> if you mark a variable with a transient key word - it

> will be ignored by serialization/

Yes, I know. What I meant was that when you serialize the containing object, the contained object will be ignored, but you could serialize it on the side whenever it's required. Or am I on the wrong track?

Regards

Omer

Oaq at 2007-6-29 19:32:52
# 7

Friends, thank you all.

Let me describe my objectives more precisely.

I am working on image processing.

My objects are organized as a tree: an object has some children, which in turn have children of their own, and so on.

Every object contains a lot of data ...

The environment is something like RMI (but not exactly the same). All objects should be serialized and stored in different files.

The files are organized in a tree:

..\first\firstGeneration.ser

..\first\second1\secondGeneration1.ser

..\first\second2\secondGeneration2.ser

..\first\second1\third1\thirdGeneration1.ser

...

So I would like to serialize all my objects into different files following that tree.

Then I would like to be able to deserialize any object and also know about its children (but without deserializing them).

Assume that I deserialize secondGeneration1.ser.

Then I should know how many children secondGeneration1 has.

Now assume I deserialize firstGeneration and then want to deserialize its first child (say secondGeneration1). Since I have already deserialized that object, I should get its handle directly.

I appreciate the example from CWalker807.

Perhaps that example can serve as the starting point.
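One way to sketch the requirements described above (one file per object, child count known without loading the children, and the same handle returned for an object that is already loaded) is to store only the children's file paths in each node and keep a cache of loaded nodes. Everything here is an assumption made for illustration: the names ImageNode and NodeStore, the byte array standing in for image data, and the use of a temporary directory. It is not thread-safe as written.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

/** Hypothetical tree node: holds its own data plus the file paths of its children. */
class ImageNode implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final byte[] data;               // stand-in for the real image data
    final List<String> childFiles;   // paths relative to the store root, not the children

    ImageNode(String name, byte[] data, List<String> childFiles) {
        this.name = name;
        this.data = data;
        this.childFiles = new ArrayList<>(childFiles);
    }

    int childCount() { return childFiles.size(); }   // known without loading any child
}

/** Hypothetical store: one file per node, plus a cache so each node is loaded once. */
class NodeStore {
    private final Path root;
    private final Map<Path, ImageNode> loaded = new HashMap<>();

    NodeStore(Path root) { this.root = root; }

    void save(String relativeFile, ImageNode node) throws IOException {
        Path file = root.resolve(relativeFile).normalize();
        Files.createDirectories(file.getParent());
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(node);
        }
    }

    ImageNode load(String relativeFile) throws IOException, ClassNotFoundException {
        Path file = root.resolve(relativeFile).normalize();
        ImageNode node = loaded.get(file);
        if (node == null) {   // first request: read the file and remember the result
            try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
                node = (ImageNode) in.readObject();
            }
            loaded.put(file, node);
        }
        return node;
    }

    ImageNode child(ImageNode parent, int index) throws IOException, ClassNotFoundException {
        return load(parent.childFiles.get(index));
    }
}

class NodeStoreDemo {
    public static void main(String[] args) throws Exception {
        NodeStore store = new NodeStore(Files.createTempDirectory("nodes"));
        String secondFile = "first/second1/secondGeneration1.ser";
        String firstFile = "first/firstGeneration.ser";
        store.save(secondFile, new ImageNode("second1", new byte[16],
                                             Collections.<String>emptyList()));
        store.save(firstFile, new ImageNode("first", new byte[16],
                                            Arrays.asList(secondFile)));

        ImageNode second = store.load(secondFile);
        ImageNode first = store.load(firstFile);
        System.out.println(first.childCount());                // 1
        System.out.println(store.child(first, 0) == second);   // true
    }
}
```

Because a node only stores paths, deserializing firstGeneration never touches the child files; the cache is what makes the second request for secondGeneration1 return the existing handle.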

Regards,

Youbin

youbin1 at 2007-6-29 19:32:52
# 8

Make accessor methods for your objects. In each one, check whether the object has been loaded yet: if not, deserialize it; otherwise just return the reference.

e.g.

String getFirstName()
{
    // load the value on first access, then reuse it
    if (this.firstName == null)
    {
        this.firstName = deserializationRoutine();
    }
    return this.firstName;
}
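Applied to the Father and Child case from earlier in the thread, the accessor idea could be sketched like this. The class LazyFather, the side file for the children, and the helper names are all assumptions for illustration; the children are marked transient and written to their own file, so deserializing the parent does not touch them.

```java
import java.io.*;

class Child implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    Child(int id) { this.id = id; }
}

/** Hypothetical parent whose children live in a separate file and load on demand. */
class LazyFather implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String childrenFile;    // where the children are stored
    private final int numChildren;        // cheap to keep with the parent
    private transient Child[] children;   // not written with the parent

    LazyFather(Child[] children, String childrenFile) throws IOException {
        this.childrenFile = childrenFile;
        this.numChildren = children.length;
        this.children = children;
        // Serialize the children "on the side", into their own file.
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(childrenFile))) {
            out.writeObject(children);
        }
    }

    int getNumChildren() { return numChildren; }
    boolean childrenLoaded() { return children != null; }

    /** Accessor that deserializes the children only on first use. */
    Child[] getChildren() throws IOException, ClassNotFoundException {
        if (children == null) {
            try (ObjectInputStream in =
                     new ObjectInputStream(new FileInputStream(childrenFile))) {
                children = (Child[]) in.readObject();
            }
        }
        return children;
    }
}

class LazyFatherDemo {
    /** Serializes the parent to memory and reads it back, as a stand-in for a file or network hop. */
    static LazyFather roundTrip(LazyFather f) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(f);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return (LazyFather) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File side = File.createTempFile("children", ".ser");
        side.deleteOnExit();
        LazyFather copy = roundTrip(
            new LazyFather(new Child[] { new Child(1), new Child(2) }, side.getPath()));
        System.out.println(copy.childrenLoaded());      // false
        System.out.println(copy.getNumChildren());      // 2
        System.out.println(copy.getChildren()[1].id);   // 2
    }
}
```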

Hope this helps

Regards

Omer

Oaq at 2007-6-29 19:32:52