new line in Big Endian format
How do I represent a newline character in big-endian format? I have a third-party utility which reads a text file in big-endian format. I am creating this text file in a one-word-per-line format. I am writing the file like this:
FileOutputStream fs = new FileOutputStream(fileName);
PrintStream pos = new PrintStream(fs, true, "UTF-16BE");
while (...) {
    // get each word and write it to the file
    pos.print(word);
    // ********* I want to add a new line to this file after every word.
}
If I use '\n', it creates a weird character inside the file.
If I use the line.separator property, the result is the same.
println also gives the same trouble.
I am looking for the newline string or char in big-endian format.
Any ideas? Thanks in advance.
[840 byte] By [kurmata] at [2007-11-27 4:34:42]

You can use pos.println(word); but this will be just the same as writing a "\n". What is the character code of the 'weird character'?
When I view the file in Notepad, it's a rectangle.
Try this code and open the output file in Notepad:
try {
    FileOutputStream fs = new FileOutputStream("test.txt");
    PrintStream pos = new PrintStream(fs, true, "UTF-16BE");
    pos.println("wordone");
    pos.println("wordtwo");
} catch (Exception e) {
    e.printStackTrace();
}
Notepad is not a good editor and it is not good for looking at bytes. Find an open source hex edit/display utility and then look at the value. I'm betting that the square has a byte value of zero.
The issue is that the third-party utility does not recognise the file as big-endian. They have given me a sample file which does not have this weird character, and it is human-readable.
Open the sample file in a hex editor and investigate how the characters have been encoded.
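If no hex editor is at hand, a few lines of Java can print the leading bytes of a file for comparison. This is only a minimal sketch: the class name HexPeek and the helper toHex are made up for illustration, and "test.txt" is simply the file name used in the test code earlier in the thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class HexPeek {
    // Formats up to `limit` bytes as space-separated, two-digit uppercase hex.
    public static String toHex(byte[] bytes, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, bytes.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative byte values print as 80..FF.
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default; pass the sample file's path as the first argument.
        String path = args.length > 0 ? args[0] : "test.txt";
        byte[] all = Files.readAllBytes(Paths.get(path));
        System.out.println(toHex(all, 16));
    }
}
```

Running it once on the vendor's sample file and once on the generated file makes any difference in the first bytes easy to spot.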
Give %n a try instead of \n.
The rectangle is displayed in Notepad because by default it handles ANSI character encoding.
In ANSI, if I'm not mistaken a \n is represented by Carriage Return(CR) as a character.
While in unicode \n is a combination of a Carriage Return(CR) and Line Feed(LF) which shows up as a rectangle.
I messed it up. The value written between two words is 00 0D 00 0A 00 41
That's "\r\nA" encoded in UTF-16BE. Have you tried this? pos.print("word1\r\nAword2");
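To check how a line break is encoded, the string can be converted to bytes directly, without writing a file. A small sketch follows; the class name NewlineBytes and the helper hex are illustrative names, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class NewlineBytes {
    // Encodes `text` with the given charset and returns the bytes as hex.
    public static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Line feed alone: prints 00 0A
        System.out.println(hex("\n", StandardCharsets.UTF_16BE));
        // CR, LF, then 'A': prints 00 0D 00 0A 00 41
        System.out.println(hex("\r\nA", StandardCharsets.UTF_16BE));
        // Same characters in ASCII: prints 0D 0A 41
        System.out.println(hex("\r\nA", StandardCharsets.US_ASCII));
    }
}
```

In UTF-16BE each of these characters takes two bytes, high byte first, so CR and LF become 00 0D and 00 0A. The characters themselves are the same as in ASCII.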
I tried with %n and I am getting the literal, like wordone%nwordtwo%n
> In ANSI, if I'm not mistaken a \n is represented by
> Carriage Return(CR) as a character.
> While in unicode \n is a combination of a Carriage
> Return(CR) and Line Feed(LF) which
> shows up as a rectangle.
I'm afraid you are mistaken: '\n' is the line feed character (LF) and '\r' is the carriage return character (CR). They are single characters, neither is a combination of two other characters. There is no difference between Unicode and ANSI ASCII here.
And besides, it is the line feed character that appears as a block in MS Notepad if a carriage return character doesn't precede it. The only line separator that MS Notepad accepts is "\r\n".
it brings back the same two rectangles with a literal A
pos.format("testing%none%ntwo%nthree%n");
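The literal %n seen earlier is expected: %n is a format specifier, so only format (or printf) expands it, while print copies the text unchanged. The sketch below illustrates the difference using an in-memory stream instead of a file; the class and method names are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class PercentNDemo {
    // Opens a UTF-16BE PrintStream over an in-memory buffer.
    private static PrintStream utf16be(ByteArrayOutputStream buf) {
        try {
            return new PrintStream(buf, true, "UTF-16BE");
        } catch (UnsupportedEncodingException e) {
            // UTF-16BE is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    // print() writes the text as-is, so "%n" stays literal.
    public static String viaPrint(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream out = utf16be(buf);
        out.print(text);
        out.flush();
        return new String(buf.toByteArray(), StandardCharsets.UTF_16BE);
    }

    // format() treats the text as a format string and expands "%n"
    // to the platform line separator.
    public static String viaFormat(String fmt) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream out = utf16be(buf);
        out.format(fmt);
        out.flush();
        return new String(buf.toByteArray(), StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(viaPrint("one%ntwo"));   // one%ntwo
        System.out.println(viaFormat("one%ntwo"));  // "one" and "two" on separate lines
    }
}
```

Note that %n expands to the platform's line separator, which is "\r\n" on Windows but "\n" on most other systems. If the consuming utility needs a specific separator, writing it explicitly is the safer choice.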
> it brings back the same two rectangles with a literal
> A
Then you must re-examine the sample file. You have missed something.
Look at the first few bytes. What are they? Maybe the sample file begins with a BOM (byte order marker) that you are not including in your file.
The difference between the original file and the file I am creating is that the file provided by the vendor starts with the hex bytes FE FF, and mine does not contain those bytes.
That's the BOM. If you used UTF-16 it would be included automatically, but when you use UTF-16BE it's not. So either change UTF-16BE to UTF-16, or write the string "\uFEFF" to the stream before you write anything else.
More info on the byte order mark can be found at
http://en.wikipedia.org/wiki/Byte_Order_Mark
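The two options mentioned above can be sketched as follows. This is an illustration rather than the exact code used in the thread: the class name BomWriter and the method encodeWords are hypothetical, an in-memory buffer stands in for the FileOutputStream, and the "\r\n" line terminator is an assumption based on the bytes quoted earlier.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class BomWriter {
    // Writes a BOM, then one word per line, all encoded as UTF-16BE.
    public static byte[] encodeWords(List<String> words) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buf, true, "UTF-16BE")) {
            // U+FEFF encodes to the bytes FE FF in UTF-16BE.
            out.print('\uFEFF');
            for (String word : words) {
                out.print(word);
                out.print("\r\n"); // assumed terminator: CR LF
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-16BE is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // Explicit BOM with UTF-16BE: FE FF 00 41 00 0D 00 0A
        byte[] explicit = encodeWords(Arrays.asList("A"));
        // Java's "UTF-16" encoder adds a big-endian BOM itself: FE FF 00 41
        byte[] automatic = "A".getBytes(StandardCharsets.UTF_16);
        System.out.println(Arrays.toString(explicit));
        System.out.println(Arrays.toString(automatic));
    }
}
```

Either way the BOM must be written exactly once, before any other output. With "UTF-16" the encoder takes care of that; with "UTF-16BE" it is up to the caller.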
How do I add the hex FE FF to the start of my file?
Wow, that fixed it. I added pos.print("\uFEFF"). Thank you folks.
> That's the BOM.
I'm just not feelin' it dawg. Ahem...
You folks saved my day. Thanks forum