How to prevent OutOfMemoryErrors when performing XSLT transformations

Hello,

We frequently get OutOfMemoryErrors when our XSLT transformations run, depending on the size of our source XML documents.

Here is the code that causes the above error to occur:

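System.out.println("XSLT transformation running...");
TransformerFactory tFactory = TransformerFactory.newInstance();
// args[2] is the stylesheet, args[0] the source document, args[3] the output file
Transformer transformer = tFactory.newTransformer(new StreamSource(args[2]));
// Close the output stream even if the transformation fails
try (FileOutputStream out = new FileOutputStream(args[3])) {
    transformer.transform(new StreamSource(args[0]), new StreamResult(out));
}
System.out.println("XSLT transformation complete!");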

and the imports

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.File;
import java.io.IOException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

We cannot increase memory on our servers, nor can we reduce the size of our source documents. Is there a way to prevent this kind of error or to reduce the memory footprint (even if it means the transformation takes longer)?

Any clue welcome,

Thanks,

Julien.

[2657 byte] By [balteoa] at [2007-11-27 8:53:24]
# 1

Have you tried any of the "java -X" options?

For instance, the following:

java -X
    -Xms<size>    set initial Java heap size
    -Xmx<size>    set maximum Java heap size

I found myself getting the error "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space" when parsing a big XML file, and I had to run the program with the -Xmx parameter (java -Xmx128m -jar MyProg.jar myXML.xml).
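
If you are not sure which heap limit is actually in effect, a small sketch like the following can report it from inside the program. The class name HeapReport and the helper toMegabytes are made up for illustration; Runtime.maxMemory(), totalMemory() and freeMemory() are standard JDK calls.

```java
public class HeapReport {

    // Hypothetical helper: convert a byte count to whole megabytes.
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the most the JVM will try to use (roughly the -Xmx value).
        System.out.println("Max heap (MB):   " + toMegabytes(rt.maxMemory()));
        // totalMemory() is the heap currently reserved by the JVM.
        System.out.println("Total heap (MB): " + toMegabytes(rt.totalMemory()));
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("Free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it with and without -Xmx128m should show whether the option is being picked up.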

kaderuda at 2007-7-12 21:10:38 > top of Java-index,Enterprise & Remote Computing,Enterprise Technologies...
# 2
Thanks,

Unfortunately, I have no control over the JVM. I am just a developer and cannot set the JVM options. I just thought I could do it programmatically...

Thanks anyway,

Julien.

balteoa at 2007-7-12 21:10:38
# 3
...programmatically: I meant reducing the memory footprint.

J.

balteoa at 2007-7-12 21:10:38
# 4

I think the XSLT and what it is transforming would determine the memory footprint for the most part.

Xalan's docs say that it can do some processing while SAX events are being read. If you use a SAXTransformerFactory (see the JAXP spec), it might not automatically create a DOM before passing the input to XSLT/XPath.
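
A minimal sketch of that idea, assuming the default JAXP implementation supports SAXTransformerFactory: the parser pushes SAX events straight into a TransformerHandler instead of the source being handed over as a StreamSource. The class name SaxTransformSketch, the transform method and the sample documents are invented for illustration. Whether this actually lowers memory use depends on the XSLT processor, which may still build its own internal tree of the input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SaxTransformSketch {

    // Hypothetical helper: runs the stylesheet over the XML using SAX events.
    static String transform(String xslt, String xml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Not every implementation supports the SAX extensions, so check first.
        if (!factory.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("SAXTransformerFactory is not supported");
        }
        SAXTransformerFactory saxFactory = (SAXTransformerFactory) factory;

        // The handler is a SAX ContentHandler that feeds the transformation.
        TransformerHandler handler =
                saxFactory.newTransformerHandler(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // XSLT needs a namespace-aware parser.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);

        // Parsing drives the transformation; no DOM is built by this code.
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xslt =
                "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
              + "<xsl:output method='text'/>"
              + "<xsl:template match='/'><xsl:value-of select='count(/orders/order)'/></xsl:template>"
              + "</xsl:stylesheet>";
        String xml = "<orders><order/><order/></orders>";
        System.out.println(transform(xslt, xml));
    }
}
```

For real files, the StringReader inputs would be replaced by file-based sources and the StringWriter by a FileOutputStream, as in the original code.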

Within the XPath expressions, avoiding ancestor-or-self (//) would reduce the amount of memory required (conceptually). Generally, avoiding the need to have more of the infoset in memory to evaluate an expression would reduce the need for memory (conceptually).
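
As a small illustration of the kind of rewrite meant here, the sketch below evaluates the same count once with // (the abbreviation for the descendant-or-self axis) and once with an explicit path. The class name XPathPathSketch and the sample document are invented. The example only shows that the two expressions select the same nodes; it does not measure memory, and the javax.xml.xpath API used here builds a tree of the input in any case.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathPathSketch {

    // Made-up sample document.
    static final String XML =
            "<orders><order><id>1</id></order><order><id>2</id></order></orders>";

    // Evaluates a numeric XPath expression against the sample document.
    static double evaluateNumber(String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Double) xpath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        // Searches the whole document for id elements.
        System.out.println(evaluateNumber("count(//id)"));
        // Names the exact location, so the processor need not scan every descendant.
        System.out.println(evaluateNumber("count(/orders/order/id)"));
    }
}
```

In a stylesheet, the same change would mean writing select="/orders/order/id" rather than select="//id" wherever the structure of the input is known.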

It's a deep question that you are asking, really. I've seen Michael Kay (author of Saxon, XSLT spec editor and XSLT book author) talk about what types of XPath expressions he tests to make Saxon faster. I think that could have some relation to the memory used by your XSLT. See:

http://saxonica.blogharbor.com/blog/_archives/2007/4/27/2908908.html

You would probably get better answers if you ask on an XSL list, e.g.:

http://www.mulberrytech.com/xsl/xsl-list/

Message was edited by: queshaw

queshawa at 2007-7-12 21:10:38
# 5
Queshaw,

Thanks a lot for your detailed reply!!

All the best,

Julien.

balteoa at 2007-7-12 21:10:38
# 6
Oops. I should have said "descendant-or-self" (//), not "ancestor-or-self".

queshawa at 2007-7-12 21:10:38