XML transformation and DTD problem

Hello everybody!!

I am using Xalan's TransformerFactory to transform one XML document into another.

The thing is, I am receiving an XML document with a DTD declared in the DOCTYPE, but this DTD is not public; it is local. I tried to ignore the declaration but could not find out how. So I created a DTD for this XML, and that part is OK, but now I get "[org.xml.sax.SAXParseException: Content is not allowed in prolog.]".

This happens when the parser reaches the declaration:

<?xml version="1.0" encoding="UTF-8" ?>

<!DOCTYPE foo SYSTEM "def.dtd">

Why does this happen? Any ideas?

[def.dtd is accessible by the parser]
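For readers with the same "how do I ignore the DTD" question, one common approach is to give the parser an EntityResolver that answers every external-entity request with an empty document. The following is only a sketch under assumptions: the class and method names are made up for illustration, it uses the JDK's built-in JAXP DOM parser rather than anything specific to Xalan, and skipping the DTD also drops any entities or default attribute values the DTD would have defined.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original post.
final class IgnoreDtdExample {

    // Parses XML bytes without ever opening the DTD named in the DOCTYPE.
    static Document parseIgnoringDtd(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Answer every external-entity lookup (including def.dtd) with an
        // empty document, so no file or network access is attempted.
        builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader("")));
        return builder.parse(new ByteArrayInputStream(xml));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!DOCTYPE foo SYSTEM \"def.dtd\">\n"
                + "<foo>bar</foo>";
        Document doc = parseIgnoringDtd(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

The resulting Document can be wrapped in a javax.xml.transform.dom.DOMSource and handed to a Transformer.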

[663 byte] By [Mackleina] at [2007-11-27 5:55:43]
# 1
My bad, the problem is not the DOCTYPE declaration. I removed it and still get the error, so it seems to be a problem with the file's character encoding.
Mackleina at 2007-7-12 16:24:56 > top of Java-index,Java Essentials,Java Programming...
# 2
That often means you have whitespace before the <?xml bit.
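A quick way to check for whitespace or any other stray bytes is to dump the first few bytes of the document as hex and see what comes before the 3C 3F ("<?") pair. This is a minimal sketch; the class and method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for inspecting the start of a document.
final class PrologPeek {

    // Returns the first `count` bytes of `data` as space-separated hex pairs.
    static String hexPrefix(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(count, data.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A document that (wrongly) starts with a space before "<?xml".
        byte[] sample = " <?xml".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hexPrefix(sample, 6)); // prints "20 3C 3F 78 6D 6C"
    }
}
```

A clean document starts with 3C 3F 78 6D 6C; a UTF-8 byte order mark would show up as EF BB BF in front of that.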
DrClapa at 2007-7-12 16:24:56
# 3

I checked the file, even with a hex editor, and there is nothing before <?xml. The thing is, the file comes from the internet gzip-encoded, so I use this to read it and write the file:

// urlcon is an HttpURLConnection
BufferedReader rd = new BufferedReader(new InputStreamReader(new GZIPInputStream(urlcon.getInputStream()), "UTF-8"));

// outstream is a FileOutputStream
BufferedWriter fw = new BufferedWriter(new OutputStreamWriter(outstream, "UTF-8"));

I still get the error, and when I open the file in a text editor it says the encoding is ANSI.

Should I drop the BufferedReader and use byte streams instead?

Mackleina at 2007-7-12 16:24:56
# 4

When you're dealing with XML parsing it's always a good decision to use streams rather than readers. XML is designed for auto-detection of encodings. But that isn't likely to be your problem.
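To illustrate the point about auto-detection, here is a small sketch that hands the parser raw bytes and lets it pick the charset from the XML declaration. The class and method names are made up for illustration, and the sample document is deliberately encoded as ISO-8859-1 rather than UTF-8 to show that no charset is named anywhere in the Java code.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, not code from this thread.
final class StreamParse {

    // Parses from a byte stream; the parser reads the encoding from the
    // XML declaration (or a byte order mark) instead of being told one.
    static String rootText(InputStream in) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<foo>caf\u00e9</foo>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(new ByteArrayInputStream(bytes)));
    }
}
```

With a Reader the decoding has already happened before the parser sees the data, so a wrong charset guess cannot be corrected by the declaration.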

I don't understand how you could check the file with a hex editor when you don't even have a file. Is it possible the input stream starts with a byte-ordering mark or something like that?
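If a byte order mark is the suspect, one way to rule it out is to peek at the first three bytes and drop them when they are the UTF-8 BOM (EF BB BF). The sketch below is an assumption about how that could be done, with invented class and method names. It matters mostly when the data is later decoded through a Reader, because the BOM then turns into the character U+FEFF in front of <?xml, which a parser can report as content in the prolog.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper class, not code from this thread.
final class BomSkipper {

    // Returns a stream positioned after a leading UTF-8 BOM, if there is one.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // Read up to three bytes; a single read() may return fewer.
        while (n < 3) {
            int r = pin.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean bom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        // Not a BOM: push the bytes back so the caller sees them again.
        if (!bom && n > 0) {
            pin.unread(head, 0, n);
        }
        return pin;
    }

    public static void main(String[] args) throws IOException {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        InputStream in = skipUtf8Bom(new ByteArrayInputStream(withBom));
        System.out.println((char) in.read()); // prints "<"
    }
}
```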

DrClapa at 2007-7-12 16:24:56
# 5

> I don't understand how you could check the file with
> a hex editor when you don't even have a file.

I didn't post the whole code. This is what I do next:

String line;
while ((line = rd.readLine()) != null) {
    fw.write(line);
    fw.newLine();
    fw.flush();
}

Mackleina at 2007-7-12 16:24:56
# 6

I tried streams:

GZIPInputStream gzst = new GZIPInputStream(urlcon.getInputStream());
int len;
byte[] buf = new byte[1024];
outstream = new FileOutputStream(new File("temp.xml"));
// Copy the decompressed bytes to the file unchanged, with no charset decoding.
while ((len = gzst.read(buf)) > 0) {
    outstream.write(buf, 0, len);
}

:( Still getting:

javax.xml.bind.UnmarshalException
 - with linked exception:
[org.xml.sax.SAXParseException: Content is not allowed in prolog.]
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(Unknown Source)
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(UnmarshallerImpl.java:481)
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:203)
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:172)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)

Mackleina at 2007-7-12 16:24:56
# 7
Oh OK, it works now. Using streams works fine!!
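The thread does not say what the final code looked like, so the following is only a guess at a tidy version of the byte-stream approach: decompress the response and write the bytes to a file without decoding them, closing everything before the file is parsed. The class and method names are invented for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, not the poster's actual solution.
final class GunzipToFile {

    // Decompresses `compressed` and stores the raw bytes in `target`.
    // No Reader or Writer is involved, so the bytes (and the encoding the
    // XML declaration names) reach the file untouched.
    static void saveGunzipped(InputStream compressed, Path target)
            throws IOException {
        // try-with-resources closes the stream even if the copy fails.
        try (InputStream in = new GZIPInputStream(compressed)) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

With the names used earlier in the thread, the call would be something like GunzipToFile.saveGunzipped(urlcon.getInputStream(), Paths.get("temp.xml")), after which temp.xml can be given to the unmarshaller or transformer as a File or an InputStream.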
Mackleina at 2007-7-12 16:24:56
# 8
How did you solve it?
Dalzhima at 2007-7-12 16:24:56