Why is native2ascii embedding \ufeff at the start of my properties file?

Hi,

I am new to the Java internationalization world, and I was astonished to learn that it is not possible to use non-ANSI encoded property files. Anyhow, I tried to use the native2ascii utility to convert a UTF-8 encoded property file to ANSI using this command:

>>native2ascii -encoding UTF-8 MessageBundle-UTF8.properties MessageBundle_en_US.properties

In response I got an ANSI-encoded file with this text:

\ufeffTitle = Window Title

Then, after setting the locale to English (US), I tried to access it using this Java code:

import java.util.Locale;
import java.util.ResourceBundle;

Locale sysLocale = Locale.getDefault();
ResourceBundle messages = ResourceBundle.getBundle("MessageBundle", sysLocale);
System.out.println(messages.getString("Title"));

This code throws an exception, "Can't find resource for bundle java.util.PropertyResourceBundle, key Title", when it executes messages.getString("Title").

However, if I remove \ufeff from the start of the text in the properties file, the code works fine. As far as I can tell, \ufeff was not in the original UTF-8 file and was generated by native2ascii.

Can you please tell me what is wrong?

Thanks

Saqib

[1154 byte] By [msaqibsaeeda] at [2007-10-3 7:52:52]
# 1

What you are seeing is the BOM (byte order mark), and if you are seeing it in the result file from native2ascii, then the BOM was definitely also there in the original file. The reason you are seeing it now is that native2ascii converted it to \u notation, like any other non-ASCII character. So native2ascii did not generate anything.

Try looking at your source file in a hex editor, and you will see the BOM (in UTF-8 it is the three bytes EF BB BF at the very start of the file).
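If no hex editor is at hand, the same check can be sketched in a few lines of Java. This is only an illustration: the class and method names (BomCheck, startsWithUtf8Bom, fileHasUtf8Bom) are made up for this post, and it only looks for the UTF-8 form of the BOM.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reports whether data begins with the UTF-8 BOM (EF BB BF).
final class BomCheck {

    // True if the first three bytes are EF BB BF.
    static boolean startsWithUtf8Bom(byte[] head) {
        return head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
    }

    // Reads at most the first three bytes of the file and checks them.
    static boolean fileHasUtf8Bom(Path file) throws IOException {
        byte[] head = new byte[3];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break; // file is shorter than three bytes
                }
                filled += n;
            }
        }
        return filled == head.length && startsWithUtf8Bom(head);
    }
}
```

Calling BomCheck.fileHasUtf8Bom(Paths.get("MessageBundle-UTF8.properties")) on the original file should tell you whether the BOM was already there before native2ascii ran.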

You probably saved your file in Notepad or another application that automatically inserts a BOM in UTF-8 files. Use a proper editor to save your files (one that does not insert a BOM), and you won't have the problem.
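If switching editors is not an option, one possible workaround is to strip the BOM yourself before running native2ascii. The following is a rough sketch, not a tested tool: BomStrip and its method names are invented for illustration, and it assumes the file is small enough to read into memory. It relies on the fact that Java's UTF-8 decoder keeps a leading BOM as the character U+FEFF rather than discarding it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: removes a leading BOM from UTF-8 text.
final class BomStrip {

    // Drops a single leading U+FEFF, if present; otherwise returns the text unchanged.
    static String stripLeadingBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    // Reads source as UTF-8 and writes it to target without the BOM.
    static void rewriteWithoutBom(Path source, Path target) throws IOException {
        String text = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        Files.write(target, stripLeadingBom(text).getBytes(StandardCharsets.UTF_8));
    }
}
```

The cleaned copy written by rewriteWithoutBom can then be passed to native2ascii in place of the original file.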

one_danea at 2007-7-15 2:55:03
# 2

Have you edited your properties file with Notepad on Windows, or some other Microsoft software? In that case, it's not 'native2ascii' but 'Notepad' that inserts the '\ufeff'.

Notepad automatically adds a Byte-Order-Mark (cf. http://en.wikipedia.org/wiki/Byte_Order_Mark) at the very beginning of the file contents (which is actually not necessary in the UTF-8 encoding). That's why you see '\ufeff' when the file is converted to an ISO 8859-1 properties file.
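This also explains the exception. As a small sketch (BomKeyDemo is just an illustrative name), loading the converted line with java.util.Properties shows that the escaped BOM becomes part of the key, so a lookup for plain "Title" finds nothing:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo: shows which key the converted line really defines.
final class BomKeyDemo {

    // Loads the single line that native2ascii produced in the question.
    static Properties loadSample() {
        Properties props = new Properties();
        try {
            // The doubled backslash yields the literal escape text, as it appears in the file.
            props.load(new StringReader("\\ufeffTitle = Window Title\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = loadSample();
        // The key "Title" does not exist, so this lookup returns null.
        System.out.println(props.getProperty("Title"));
        // The real key starts with U+FEFF, so this lookup finds the value.
        System.out.println(props.getProperty("\uFEFFTitle"));
    }
}
```

ResourceBundle reads the file the same way, which is why getString("Title") cannot find the key until the '\ufeff' is removed.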

Please use a text editor that does not automatically insert the BOM when editing UTF-8 text.

Naoto

naotoa at 2007-7-15 2:55:03
# 3

Rather than using native2ascii, you could also try our little utility at: http://www.isocra.com/encoder/

Here you can paste in the text in UTF-8 and it'll give you back the escaped text.

Regards,

Denis

denishowa at 2007-7-15 2:55:03