Identify malformed string in modified UTF-8

I am getting a java.io.UTFDataFormatException when I read serialized data from a file.

The data file contains objects that had their data populated by a user through a Swing GUI.

I am assuming that the user copied and pasted some weird text into a text box and saved it.

Is it possible to identify which bytes in the file are causing the exception?

Thank you,

Al

[427 byte] By [alzoida] at [2007-11-27 0:09:12]
# 1
I would assume in this case that the byte stream is not encoded as UTF-8. Are you aware what encoding the client is using when creating this stream?
kilyasa at 2007-7-11 16:09:05 > top of Java-index,Core,Core APIs...
# 2
The stream was written with ObjectOutputStream. The user had the ability to put their own String data into the object and then click "Save", which writes it to a file. I am assuming that they entered some strange data which is causing the exception.
alzoida at 2007-7-11 16:09:05
# 3
A String in Java is always encoded as UTF-16 internally, so there is no other possibility there. Byte streams, however, can have an encoding, and that is what you need to find out: the encoding of the incoming byte stream.
kilyasa at 2007-7-11 16:09:05
# 4
I used FileReader.getEncoding() and it returned CP1252. I took another working data file from the application and it had the same encoding. Is there anything else I can look for that would be causing this error?
alzoida at 2007-7-11 16:09:05
# 5
If the stream was written with ObjectOutputStream the only way you can read it is with an ObjectInputStream.
ejpa at 2007-7-11 16:09:05
# 6
I get the error when I try to read an object from the file with ObjectInputStream. I only used a FileReader so I could get the encoding of the file.
alzoida at 2007-7-11 16:09:05
# 7
Do not post the relevant portion of code where you read the data. Let us guess!
BIJ001a at 2007-7-11 16:09:05
# 8

The code where I read the data is very basic. I don't think the problem is with the code, since it has not changed in 2 years. I am getting a java.io.UTFDataFormatException on this line:

data = (Hashtable<Integer,RowData>)oisData.readObject();

According to the docs, I am getting this error because the InputStream encountered a malformed modified UTF-8 string.

I know that there is some malformed data in the file. If I had a way to identify that data I might be able to figure out how it got there or how to remove it.

If you think the code that is throwing the exception will help you, here it is:

try {
    // read data
    fisData = new FileInputStream(dataFile);
    oisData = new ObjectInputStream(fisData);
    data = null;
    data = (Hashtable<Integer,RowData>)oisData.readObject();
    if (data == null) {
        data = new Hashtable<Integer,RowData>();
    }
    fisData.close();
    oisData.close();
} catch (FileNotFoundException fnf) {
    ErrorLog errLog = new ErrorLog(fnf);
} catch (ClassNotFoundException cnf) {
    ErrorLog errLog = new ErrorLog(cnf);
} catch (IOException ioe) {
    ErrorLog errLog = new ErrorLog(ioe);
}

alzoida at 2007-7-11 16:09:05
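
One way to narrow down where the read goes wrong, sketched under assumptions: wrap the file stream in a byte counter and report how far ObjectInputStream had read when the exception was thrown. The names FailureLocator, CountingInputStream and findFailureOffset are made up for this illustration, and Hashtable<Integer, String> stands in for the application's Hashtable<Integer,RowData>. Because ObjectInputStream may read ahead in blocks, the offset is only approximate: the problem lies at or before it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;
import java.util.Hashtable;

// Hypothetical helper class, not part of the original application.
class FailureLocator {
    // Returns the number of bytes consumed when deserialization failed with
    // an IOException, or -1 if the object was read without one.
    // The caller remains responsible for closing the raw stream.
    static long findFailureOffset(InputStream raw) throws ClassNotFoundException {
        CountingInputStream counter = new CountingInputStream(raw);
        try {
            ObjectInputStream ois = new ObjectInputStream(counter);
            ois.readObject();
            return -1;
        } catch (IOException e) {
            // UTFDataFormatException and EOFException are both IOExceptions.
            return counter.getCount();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in data: String values instead of the application's RowData.
        Hashtable<Integer, String> data = new Hashtable<>();
        for (int i = 0; i < 50; i++) {
            data.put(i, "row " + i);
        }
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(data);
        }
        byte[] whole = bos.toByteArray();
        // Simulate a damaged file by cutting the stream in half.
        byte[] cut = Arrays.copyOf(whole, whole.length / 2);
        System.out.println("intact:    " + findFailureOffset(new ByteArrayInputStream(whole)));
        System.out.println("truncated: " + findFailureOffset(new ByteArrayInputStream(cut)));
    }
}

// Counts the bytes that the wrapped stream has handed out so far.
class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) {
            count += skipped;
        }
        return skipped;
    }

    long getCount() {
        return count;
    }
}
```

With the real file, the same method could be called with new FileInputStream(dataFile), and the bytes just before the reported offset inspected in a hex viewer.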
# 9
And how are you writing this object? NB: using a FileReader on this file to determine its encoding is meaningless. It doesn't have an encoding, as it isn't a UTF file. It's a serialized stream.
ejpa at 2007-7-11 16:09:05
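
To illustrate the point that the file is a serialized stream rather than text, here is a small sketch that checks for the 4-byte header every Java serialization stream starts with (STREAM_MAGIC and STREAM_VERSION from java.io.ObjectStreamConstants). The class and method names are hypothetical, and the file path is assumed to be passed as the first command-line argument.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectStreamConstants;

// Hypothetical helper class for illustration only.
class SerializedHeaderCheck {
    // Returns true if the stream begins with the serialization header:
    // STREAM_MAGIC (0xACED) followed by STREAM_VERSION (5).
    static boolean looksSerialized(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        try {
            short magic = din.readShort();
            short version = din.readShort();
            return magic == ObjectStreamConstants.STREAM_MAGIC
                    && version == ObjectStreamConstants.STREAM_VERSION;
        } catch (EOFException e) {
            // Fewer than four bytes available: cannot be a serialized stream.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the file to inspect, supplied by the caller.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(looksSerialized(in));
        }
    }
}
```

A file that passes this check is binary serialization data, so the value returned by FileReader.getEncoding() (the platform default charset) says nothing about its contents.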
# 10

I am writing the object in a similar fashion. The 'data' variable is a Hashtable.

try {
    fosData = new FileOutputStream(FILE_NAME);
    oosData = new ObjectOutputStream(fosData);
    oosData.writeObject(data);
    fosData.close();
    oosData.close();
    fosData = null;
    oosData = null;
} catch (FileNotFoundException fnfe) {
    ErrorLog errLog = new ErrorLog(fnfe);
} catch (IOException ioe) {
    ErrorLog errLog = new ErrorLog(ioe);
}

alzoida at 2007-7-11 16:09:05
# 11
> fosData.close();
> oosData.close();

Delete the first line. You must always close the outermost nested stream, and that closes all the others. In this case, closing fos before oos means that oos never gets a chance to flush().
ejpa at 2007-7-11 16:09:05
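
A minimal sketch of the close order described above, using try-with-resources (available since Java 7) so that the ObjectOutputStream is closed, and therefore flushed, before the underlying file stream. The class name SaveAndLoad is made up for this example, and Hashtable<Integer, String> stands in for the application's Hashtable<Integer,RowData>.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Hashtable;

// Hypothetical helper class; String values stand in for RowData.
class SaveAndLoad {
    static void save(Hashtable<Integer, String> data, File file) throws IOException {
        // Resources are closed in reverse order of declaration:
        // oos first (which flushes its buffer), then fos.
        try (FileOutputStream fos = new FileOutputStream(file);
             ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(data);
        }
    }

    @SuppressWarnings("unchecked")
    static Hashtable<Integer, String> load(File file)
            throws IOException, ClassNotFoundException {
        try (FileInputStream fis = new FileInputStream(file);
             ObjectInputStream ois = new ObjectInputStream(fis)) {
            return (Hashtable<Integer, String>) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("rows", ".ser");
        file.deleteOnExit();
        Hashtable<Integer, String> data = new Hashtable<>();
        data.put(1, "first row");
        save(data, file);
        System.out.println(load(file).equals(data));
    }
}
```

On older Java versions without try-with-resources, the equivalent is to call oosData.close() only, ideally in a finally block.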
# 12
Do you think this could have caused the malformed data? Is it possible to recover the data?
alzoida at 2007-7-11 16:09:05
# 13
The data is truncated. No real way to recover.
ejpa at 2007-7-11 16:09:05