Writing/reading an ASCII file with CR

I have to create a file (and later read one that follows the same restrictions) with the following encoding:

"The files defined in this document should be sent using the ASCII character set. Within segments the pipe character is used as the delimiter (value &7C). Each record is terminated by a Carriage Return character (value &0D)."

I do it like:

byte pipe[] = new byte [] { (byte) 124 };

byte cr[] = new byte [] { (byte) 13 };

String pipeString = new String( pipe, 0, 1, "UTF-8");

String crString = new String( cr, 0, 1, "UTF-8");

and then, for example, I write the crString into the file after every record.

Does that look all right to you?

The input to my file comes from another file that I read in,

which contains German umlauts.

This is why I write it with the UTF-8 encoding.

Is that good enough, or do I have to look for every umlaut in my input file and then change it using something like

byte A_Umlaut[] = new byte [] { (byte) 196 };

to follow the ASCII convention?

Eventually I will get back a file which uses the same encoding.

I would like to read it in record by record.

Obviously I cannot read it (with any Java Reader) using readLine()

as it doesn't hold the linefeed character, correct?

Do I have to read it byte by byte, looking for the CR?

What would be the best way for that (the file will be very big)?

Thanks,

Richy

[1463 byte] By [brichya] at [2007-10-2 7:33:09]
# 1

> Does that look all right to you?

No. UTF-8 is not ASCII.

Anyway, you want a pipe and CR string...

String pipe = "|";

String CR = "\r";

or

char pipe = '|';

char CR = '\r';

> The input to my file comes from another file that I read in,

> which contains German umlauts.

> This is why I write it with the UTF-8 encoding.

Okay ... but your requirements say ASCII, and that's not ASCII. ASCII only covers the values 0-127, and umlauts are NOT in the ASCII charset. They are in Latin-1 (ISO-8859-1), or do you mean extended ASCII?

> Obviously I cannot read it (with any Java Reader) using readLine()

> as it doesn't hold the linefeed character, correct?

No, readLine() will read lines that terminate in \r, \n or \r\n... Read the API docs, it says it right there.
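To illustrate the point about readLine(), here is a minimal sketch that reads CR-terminated records one at a time. The class name CrRecordReader and the method readRecords are made up for this example. It collects records in a list only to keep the demo short; for a very big file you would process each record inside the loop instead of storing it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for this thread, not part of any library.
class CrRecordReader {

    // Reads records terminated by '\r' (BufferedReader.readLine() also accepts '\n' and "\r\n").
    static List<String> readRecords(Reader source) throws IOException {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String record;
            // readLine() returns the record without its terminator, or null at end of input.
            while ((record = reader.readLine()) != null) {
                records.add(record);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Two pipe-delimited records, each terminated by a lone carriage return.
        List<String> records = readRecords(new StringReader("1|Smith|Berlin\r2|Jones|London\r"));
        for (String record : records) {
            System.out.println(record);
        }
    }
}
```

Because BufferedReader buffers its input, this does not read the underlying file one byte at a time.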

bsampieria at 2007-7-16 21:13:22
# 2

You have a problem! Not all UTF-8 characters can be represented in ASCII, so as it stands you can't meet the requirements unless you encode all the non-ASCII characters in some way.

To read and write ASCII you can use a Reader and Writer along the lines of

BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("your file"),"ASCII"));

and its equivalent for reading, a BufferedReader constructed using the same pattern.

You then have to decide how to encode the non-ASCII characters so that they can be stored as ASCII.
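Putting the Writer and Reader together, a round trip could look roughly like the sketch below. The class AsciiRecordFile and its methods are invented for illustration, and it assumes Java 7 or later for StandardCharsets and try-with-resources. Note that it writes '\r' explicitly rather than calling newLine(), which would emit the platform line separator.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for this thread, not part of any library.
class AsciiRecordFile {

    // Writes each record as pipe-delimited fields followed by a single CR (0x0D).
    static void writeRecords(Path file, List<String[]> records) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.US_ASCII))) {
            for (String[] fields : records) {
                writer.write(String.join("|", fields));
                writer.write('\r');
            }
        }
    }

    // Reads the records back; readLine() accepts the lone CR as a terminator.
    static List<String[]> readRecords(Path file) throws IOException {
        List<String[]> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.US_ASCII))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Limit -1 keeps trailing empty fields such as "2|Jones|".
                records.add(line.split("\\|", -1));
            }
        }
        return records;
    }
}
```

One caveat: an OutputStreamWriter does not fail on characters the charset cannot represent. It substitutes the charset's replacement (a '?' for US-ASCII), so an umlaut passed to writeRecords would be silently lost.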
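One possible encoding, shown here purely as an assumption since the specification quoted in the question does not say what to do, is the usual German transliteration (ae, oe, ue, ss). The class AsciiTransliterator below is a made-up sketch. Whoever receives the file has to agree to the scheme, and it is not reversible in general.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for this thread; the transliteration scheme is an assumption.
class AsciiTransliterator {

    // Replaces German umlauts and sharp s with ASCII digraphs; other characters pass through.
    static String toAscii(String text) {
        StringBuilder result = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\u00E4': result.append("ae"); break; // a umlaut
                case '\u00F6': result.append("oe"); break; // o umlaut
                case '\u00FC': result.append("ue"); break; // u umlaut
                case '\u00C4': result.append("Ae"); break; // A umlaut (value 196 in Latin-1)
                case '\u00D6': result.append("Oe"); break; // O umlaut
                case '\u00DC': result.append("Ue"); break; // U umlaut
                case '\u00DF': result.append("ss"); break; // sharp s
                default: result.append(c);
            }
        }
        return result.toString();
    }

    // True if every character of the text can be encoded as 7-bit ASCII.
    static boolean isPureAscii(String text) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }
}
```

Calling isPureAscii on each record before writing it is a cheap way to catch characters that the mapping does not cover, instead of letting them turn into '?' in the output.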

P.S. The pipe character is just '|' and the CR character is just '\r'.

sabre150a at 2007-7-16 21:13:22