new line in Big Endian format

How do I represent a newline character in big-endian format? I have a third-party utility which reads a text file in big-endian format. I am creating this text file with one word per line. I open the stream using

FileOutputStream fs = new FileOutputStream(fileName);
PrintStream pos = new PrintStream(fs, true, "UTF-16BE");

and then write the words inside a while loop:

while (/* condition not shown */) {
    // get each word and write it to the file
    pos.print(word);
    // ********* I want to add a new line to this file after every word.
}

If I use '\n', it creates a weird character inside the file.

If I use the line.separator property, the result is the same as above.

println also gives the same trouble.

I am looking for the newline string or char in big-endian format.

Any ideas? Thanks in advance.

[840 byte] By [kurmata] at [2007-11-27 4:34:42]
# 1
You can use pos.println(word); but that just appends the platform line separator (the line.separator property), so it will behave the same as writing the line break yourself. What is the character code of the 'weird character'?
sabre150a at 2007-7-12 9:44:39 > top of Java-index,Java Essentials,New To Java...
# 2
When I view the file in Notepad, it's a rectangle.
kurmata at 2007-7-12 9:44:39
# 3

Try this code and open the output file in Notepad:

import java.io.FileOutputStream;
import java.io.PrintStream;

try {
    FileOutputStream fs = new FileOutputStream("test.txt");
    // autoflush is on; text is encoded as UTF-16BE
    PrintStream pos = new PrintStream(fs, true, "UTF-16BE");
    pos.println("wordone");
    pos.println("wordtwo");
    pos.close(); // also closes the underlying file stream
} catch (Exception e) {
    e.printStackTrace();
}

kurmata at 2007-7-12 9:44:39
# 4
Notepad is not a good editor, and it is not good for looking at bytes. Find an open source hex edit/display utility and then look at the value. I'm betting that the square has a byte value of zero.
sabre150a at 2007-7-12 9:44:39
# 5
The issue is that the third-party utility does not recognise the file as big-endian. They have given me a sample file which does not have this weird character, and it is human readable.
kurmata at 2007-7-12 9:44:39
# 6
Open the sample file in a hex editor and investigate how the characters have been encoded.
jsalonena at 2007-7-12 9:44:39
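
If no hex editor is at hand, the same check can be scripted. What follows is only a minimal sketch, assuming Java 7 or later; the class name HexHead, its method names, and the idea of passing the sample file's path as the first argument are all made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HexHead {
    // Format at most `limit` leading bytes as two-digit uppercase hex, space-separated.
    static String toHex(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int count = Math.min(limit, data.length);
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so bytes above 0x7F are not sign-extended.
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    // Reads the whole file for simplicity; fine for small text files.
    static String headOfFile(Path file, int limit) throws IOException {
        return toHex(Files.readAllBytes(file), limit);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HexHead <file>");
            return;
        }
        // Print the first 16 bytes of the file named on the command line.
        System.out.println(headOfFile(Paths.get(args[0]), 16));
    }
}
```

Running it once on the vendor's sample and once on the generated file makes it easy to compare the leading bytes side by side.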
# 7
00
kurmata at 2007-7-12 9:44:39
# 8
Sorry, 00 0D 00 0A 00
kurmata at 2007-7-12 9:44:39
# 9

Give %n a try instead of \n.

The rectangle is displayed in Notepad because, by default, it handles ANSI character encoding.

In ANSI, if I'm not mistaken a \n is represented by Carriage Return(CR) as a character.

While in unicode \n is a combination of a Carriage Return(CR) and Line Feed(LF) which shows up as a rectangle.

ArcherKinga at 2007-7-12 9:44:39
# 10
I messed it up; the right value between two words is 00 0D 00 0A 00 41
kurmata at 2007-7-12 9:44:39
# 11
That's "\r\nA" encoded in UTF-16BE. Have you tried this? pos.print("word1\r\nAword2");
jsalonena at 2007-7-12 9:44:39
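
The byte sequence quoted above can be reproduced by encoding the string directly. This is a small sketch assuming Java 7 or later (for StandardCharsets); the class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;

public class Utf16BeBytes {
    // Render bytes as two-digit uppercase hex, space-separated.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Encode text as UTF-16BE (two bytes per BMP character, high byte first, no BOM).
    static String encodeHex(String text) {
        return hex(text.getBytes(StandardCharsets.UTF_16BE));
    }

    public static void main(String[] args) {
        System.out.println(encodeHex("\r\nA")); // prints 00 0D 00 0A 00 41
    }
}
```

Each character becomes two bytes with the high byte first, which is why every ASCII character in the file is preceded by a 00 byte.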
# 12
I tried with %n and I am getting it as a literal, like wordone%nwordtwo%n
kurmata at 2007-7-12 9:44:39
# 13

> In ANSI, if I'm not mistaken a \n is represented by

> Carriage Return(CR) as a character.

> While in unicode \n is a combination of a Carriage

> Return(CR) and Line Feed(LF) which

> shows up as a rectangle.

I'm afraid you are mistaken: '\n' is the line feed character (LF) and '\r' is the carriage return character (CR). They are single characters, neither is a combination of two other characters. There is no difference between Unicode and ANSI ASCII here.

And besides, it is the line feed character that appears as a block in MS Notepad if a carriage return character doesn't precede it. The only line separator that MS Notepad accepts is "\r\n".

jsalonena at 2007-7-12 9:44:39
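
The point that '\n' and '\r' are each a single character, with the same code in ASCII and Unicode, can be checked in a few lines. A rough sketch with an invented class name, assuming Java 7 or later:

```java
import java.nio.charset.StandardCharsets;

public class LineBreakChars {
    // List the UTF-16 code units of a string in U+XXXX notation.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(codePoints("\n"));   // prints U+000A (line feed)
        System.out.println(codePoints("\r"));   // prints U+000D (carriage return)
        System.out.println(codePoints("\r\n")); // prints U+000D U+000A
        // Same character, different width: one byte in ASCII, two in UTF-16BE.
        System.out.println("\n".getBytes(StandardCharsets.US_ASCII).length); // prints 1
        System.out.println("\n".getBytes(StandardCharsets.UTF_16BE).length); // prints 2
    }
}
```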
# 14
it brings back the same two rectangles with a literal A
kurmata at 2007-7-12 9:44:39
# 15
pos.format("testing%none%ntwo%nthree%n");
Hippolytea at 2007-7-21 21:08:36
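
The literal %n reported earlier is what print produces, because %n is only interpreted by the formatting methods (format, printf, String.format). The sketch below illustrates the difference; the class and method names are hypothetical, and it writes to an in-memory buffer instead of a file.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class PercentNDemo {
    // Write "one%ntwo" through a UTF-16BE PrintStream, either with format() or
    // with print(), and return the text that actually ended up in the stream.
    static String written(boolean useFormat) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buffer, true, "UTF-16BE");
            if (useFormat) {
                out.format("one%ntwo"); // %n becomes the platform line separator
            } else {
                out.print("one%ntwo");  // no formatting: %n stays as two literal characters
            }
            out.flush();
            return buffer.toString("UTF-16BE");
        } catch (UnsupportedEncodingException e) {
            // UTF-16BE is one of the charsets every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(written(false)); // prints one%ntwo
        System.out.println(written(true).equals("one" + System.lineSeparator() + "two")); // prints true
    }
}
```

Note that %n expands to the platform line separator, so the bytes differ between Windows and Unix-like systems. If the consumer needs CR LF specifically, writing "\r\n" explicitly is more predictable.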
# 16

> it brings back the same two rectangles with a literal

> A

Then you must re-examine the sample file. You have missed something.

Look at the first few bytes. What are they? Maybe the sample file begins with a BOM (byte order mark) that you are not including in your file.

jsalonena at 2007-7-21 21:08:36
# 17
The difference between the original file and the file I am creating is that the file provided by the vendor starts with the hex bytes FE FF, and mine does not contain those bytes.
kurmata at 2007-7-21 21:08:36
# 18

That's the BOM. If you used UTF-16 it would be included automatically, but when you use UTF-16BE it's not. So either change UTF-16BE to UTF-16, or write the string "\uFEFF" to the stream before you write anything else.

More info on the byte order mark can be found at

http://en.wikipedia.org/wiki/Byte_Order_Mark

jsalonena at 2007-7-21 21:08:36
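
The difference between the two charset names can be seen by encoding a single character. A minimal sketch, assuming Java 7 or later; the class name BomDemo is invented for the example.

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {
    // Render bytes as two-digit uppercase hex, space-separated.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Java's "UTF-16" encoder writes a big-endian BOM first.
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16)));         // prints FE FF 00 41
        // "UTF-16BE" writes the same code units but no BOM.
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16BE)));       // prints 00 41
        // Writing U+FEFF yourself under UTF-16BE gives the same bytes as "UTF-16".
        System.out.println(hex("\uFEFFA".getBytes(StandardCharsets.UTF_16BE))); // prints FE FF 00 41
    }
}
```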
# 19
How do I add the hex FE FF to the start of my file?
kurmata at 2007-7-21 21:08:36
# 20
Wow, that fixed it. I added pos.print("\uFEFF"). Thank you folks.
kurmata at 2007-7-21 21:08:36
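
Putting the thread together, a writer for this kind of file might look roughly like the sketch below. It assumes the consumer wants a BOM followed by one word per line with CR LF line ends, as in the vendor sample described above; the class name, method names, and the output name words.txt are placeholders, not anything from the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;

public class BigEndianWordFile {
    // Write a BOM, then each word followed by CR LF, all encoded as UTF-16BE.
    static void writeWords(OutputStream target, List<String> words) throws IOException {
        PrintStream out = new PrintStream(target, true, "UTF-16BE");
        out.print("\uFEFF"); // byte order mark: becomes FE FF in the output
        for (String word : words) {
            out.print(word);
            out.print("\r\n"); // explicit CR LF, independent of the platform separator
        }
        out.flush();
        // PrintStream swallows IOExceptions, so check its error flag explicitly.
        if (out.checkError()) {
            throw new IOException("writing to the target stream failed");
        }
    }

    // Convenience for inspecting the bytes without touching the file system.
    static byte[] encode(List<String> words) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeWords(buffer, words);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "words.txt" is a placeholder output name.
        try (FileOutputStream file = new FileOutputStream("words.txt")) {
            writeWords(file, Arrays.asList("wordone", "wordtwo"));
        }
    }
}
```

Switching the charset name to "UTF-16" and dropping the explicit "\uFEFF" should give the same bytes, since that encoder adds the BOM itself.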
# 21
> That's the BOM.

I'm just not feelin' it dawg. Ahem...
Hippolytea at 2007-7-21 21:08:36
# 22
You folks saved my day. Thanks forum
kurmata at 2007-7-21 21:08:36