Reading a byte (> 127) from a file

I've never dealt much with file IO in Java except for text files, and I've run into a snag.

I'm trying to parse files with a horrible format. They have a mixture of ASCII and EBCDIC. There is a fixed-length run of arbitrary bytes. And some records are delimited front and back with one-byte record-length indicators.

It is the last of these that is giving me problems. At a certain point in the file, I need to read a single byte to determine the length of the next record, read that many bytes into the record, and then read another byte which should match the first. This works fine if the record length is 127 or less, but when it gets longer, I no longer read the length properly. The first record where this happens has a length byte of 137 == 0x89. It seems to read (as an int) the value 8240 == 0x2030, and when I make attempts at casting, I usually end up with the value 48 == 0x30. (If the context is important, the previous byte, which seems to read properly as the end-delimiter of the previous record, is 0x36, and the next two are both 0x40.)

I played around with the old "& 0xff" bit, but that didn't help anything. I think it has to do with file encoding. (I'm on Windows with the default file encoding, "Cp1252".) I've done my usual IO routine of wrapping a FileReader in a BufferedReader, and I'm wondering if something with a FileInputStream would help, or whether there is some encoding that would let me read this byte properly. I've never even tried to set the encoding, but am willing to try anything once. :-)

Is there anyone able to offer help?

How about solace? :-)

Thanks,

-- Scott

[1696 byte] By [CrossEye] at [2007-9-26 18:13:22]
# 1
You don't want to use Readers to read bytes. A Reader reads characters (not bytes) according to a given character encoding (not raw bytes). You want to use an InputStream to read the raw bytes.
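
To make the difference concrete, here is a minimal sketch using the four bytes mentioned in the question (0x36, 0x89, 0x40, 0x40). It assumes the runtime provides the windows-1252 charset (the "Cp1252" default mentioned above); the class and method names are made up for illustration. In that code page, byte 0x89 is the per-mille sign, U+2030, which is where the 8240 comes from, and casting that char down to a byte leaves only the low 0x30.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class ReaderVsStream {
    // Bytes from the question: end delimiter 0x36, length byte 0x89, then two 0x40 bytes.
    static final byte[] SAMPLE = {0x36, (byte) 0x89, 0x40, 0x40};

    // Reads the second value through a Reader that decodes with the named charset.
    static int secondCharViaReader(String charsetName) {
        try {
            Reader r = new InputStreamReader(
                    new ByteArrayInputStream(SAMPLE), Charset.forName(charsetName));
            r.read();          // skip the character decoded from 0x36
            return r.read();   // a decoded character, not the raw byte
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the second value straight from the byte stream.
    static int secondByteViaStream() {
        try {
            InputStream in = new ByteArrayInputStream(SAMPLE);
            in.read();         // skip 0x36
            return in.read();  // the raw byte as an int in the range 0..255
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int viaReader = secondCharViaReader("windows-1252");
        System.out.println(Integer.toHexString(viaReader));         // 2030
        System.out.println(Integer.toHexString((byte) viaReader));  // 30
        System.out.println(secondByteViaStream());                  // 137
    }
}
```

Note that InputStream.read() already returns the byte as an unsigned int, so no "& 0xff" is needed there; the mask only matters when the value has been stored in a byte variable or array first.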
schapel at 2007-7-3 2:04:51 > top of Java-index,Archived Forums,Java Programming...
# 2

You should use a byte stream to read the input file (this will work with either a text or a binary file). I use the following snippet of code to read a file (regardless of MIME type) from the web and save it on the user's disk:

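// 'in' is the source InputStream and 'outputFile' the destination; both are defined elsewhere.
BufferedInputStream reader=new BufferedInputStream(in,4096);
FileOutputStream out=new FileOutputStream(outputFile);
BufferedOutputStream writer=new BufferedOutputStream(out,4096);
byte[] buf=new byte[4096];
int bytesRead;
// read() returns the number of bytes placed in buf, or -1 at end of stream
while ((bytesRead=reader.read(buf,0,buf.length))!=-1) {
    writer.write(buf,0,bytesRead);
}
reader.close();
writer.flush();
writer.close();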

V.V.

viravan at 2007-7-3 2:04:52
# 3

> You don't want to use Readers to read bytes. A Reader
> reads characters (not bytes) according to a given
> character encoding (not raw bytes). You want to use an
> InputStream to read the raw bytes.

I'm afraid that's what I'll have to do. But all those easy-to-read CR-LF delimited ASCII lines in the file were so easy with the Reader.

I suppose I can rebuild that part of the program reading a byte at a time... Will I need to duplicate the work of InputStreamReader? Will I need to use sun.io.ByteToCharConverter? Will I need to find a better job that doesn't have all these headaches?
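
For the CR-LF delimited ASCII lines, the byte-at-a-time version can stay fairly small. The following is only a sketch under assumptions: `readAsciiLine` is a hypothetical helper name, the lines are taken to be pure US-ASCII, and a trailing CR before the LF is stripped. It uses nothing beyond java.io and java.nio.charset, so no sun.* classes are involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class AsciiLines {
    // Hypothetical helper: consumes bytes up to and including the next LF and
    // returns them decoded as US-ASCII, without the LF or a CR just before it.
    // Returns null if the stream is already at end of file.
    static String readAsciiLine(InputStream in) {
        try {
            int b = in.read();
            if (b == -1) {
                return null;
            }
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            while (b != -1 && b != '\n') {
                line.write(b);
                b = in.read();
            }
            byte[] bytes = line.toByteArray();
            int len = bytes.length;
            if (len > 0 && bytes[len - 1] == '\r') {
                len--;  // drop the CR of a CR-LF pair
            }
            return new String(bytes, 0, len, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // One text line followed by a binary length byte, as in the mixed file.
        byte[] data = {'A', 'B', '\r', '\n', (byte) 0x89};
        InputStream in = new ByteArrayInputStream(data);
        System.out.println(readAsciiLine(in));  // AB
        System.out.println(in.read());          // 137
    }
}
```

Because the helper never reads past the line terminator, the next read on the same stream starts exactly at the following byte, which is what the binary parts of the file need. Wrap the FileInputStream in a BufferedInputStream so the single-byte reads stay cheap.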

Thanks,

-- Scott

CrossEye at 2007-7-3 2:04:52
# 4

You should be able to do it like this:

InputStream bytes_in=new BufferedInputStream(new FileInputStream(myfile));

Reader chars_in=new InputStreamReader(bytes_in);

//now read chars from chars_in and bytes from bytes_in

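However, you can't wrap chars_in in a BufferedReader, because that would fill its buffer with data that you want to read from bytes_in. Be aware that an InputStreamReader may itself read ahead from the underlying stream when decoding, so check where bytes_in actually stands after each character read; reading everything as bytes and decoding only the text portions avoids the problem.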
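
For the length-delimited records described in the question, a sketch along these lines may help. `readRecord` is a hypothetical helper, and it assumes the framing is exactly one unsigned length byte, the payload, then the same length byte again. DataInputStream does no buffering of its own, so wrapping the shared stream in it does not steal bytes from later reads.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class LengthDelimited {
    // Hypothetical helper: reads one record framed by a leading and a
    // trailing one-byte length, and returns the payload bytes.
    static byte[] readRecord(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int length = data.readUnsignedByte();  // 0..255, so 0x89 comes back as 137
            byte[] record = new byte[length];
            data.readFully(record);                // throws EOFException if the file ends early
            int trailer = data.readUnsignedByte();
            if (trailer != length) {
                throw new IOException("Length mismatch: " + length + " vs " + trailer);
            }
            return record;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A 137-byte record of zeros, framed front and back with 0x89.
        byte[] framed = new byte[139];
        framed[0] = (byte) 0x89;
        framed[138] = (byte) 0x89;
        System.out.println(readRecord(new ByteArrayInputStream(framed)).length);  // 137
    }
}
```

The returned byte array can then be decoded with whichever charset that record uses, for example with new String(bytes, charset), instead of letting a Reader guess for the whole file.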

- Marcus

msundman at 2007-7-3 2:04:52
# 5

> You should be able to do like this:
> InputStream bytes_in=new BufferedInputStream(new FileInputStream(myfile));
> Reader chars_in=new InputStreamReader(bytes_in);
> //now read chars from chars_in and bytes from bytes_in

Yes, that is what I'm working on now. I have fixed the problem that prompted my post. Now I have to go back and get the other parts that I had working before working again. Shouldn't be too hard, just tedious.

Thank you everyone for your help.

-- Scott

CrossEye at 2007-7-3 2:04:52