Problem with file.encoding in Linux
Hello,
I am currently migrating a Java project from Windows to Linux. The project is finally shaping up now, except for some encoding problems.
All configuration files are saved in ISO-8859-1/Cp1252 format. When these files are read and displayed in Swing (e.g. in a JTextArea), the non-ASCII special characters are displayed incorrectly. I have tried starting the VM with -Dfile.encoding=ISO-8859-1 and with -Dfile.encoding=Cp1252, without success (this was done in Eclipse under Linux).
I then tried the opposite: I created some UTF-8 files, started the application under Windows/Eclipse, read the files and displayed them in a JTextArea. As expected, garbage was shown in place of the special characters. I then used -Dfile.encoding=UTF-8, and voilà, all characters were displayed correctly.
Why does -Dfile.encoding not work for ISO-8859-1/Linux when it does work for UTF-8/Windows? Does anyone here know?
The JRE I have been using is 1.4.2_06.
The Linux distribution is SuSE 10.0.
[985 byte] By [herrena] at [2007-10-2 20:19:13]

> Why does not -Dfile.encoding work for ISO-8859-1/Linux but -Dfile.encoding work for UTF-8/Windows?
I ran into this a few years ago. Apparently the intention is that file.encoding should not be directly modifiable - so the Windows behaviour is a bug - but should be changed via locales. At that point, however, the documentation dried up, so I ended up reading the files through an InputStreamReader wrapped around a FileInputStream, since the InputStreamReader constructor allows you to specify the charset (FileReader in this JRE always uses the platform default encoding). If you don't want to do that, you're probably best off enquiring further on a Linux forum.
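For what it's worth, here is a minimal sketch of the "specify the charset yourself" approach, so that the result no longer depends on file.encoding or the locale. The class name ExplicitCharsetRead and the helper readFile are made up for illustration; only the java.io classes are real. It deliberately sticks to constructs that should also compile on an old JRE such as 1.4.2 (StringBuffer, try/finally).

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

class ExplicitCharsetRead {

    // Hypothetical helper: reads a whole file, decoding its bytes with the
    // named charset instead of the platform default.
    static String readFile(String path, String charsetName) throws IOException {
        InputStream bytes = new FileInputStream(path);
        try {
            // InputStreamReader(InputStream, String) takes the charset name explicitly.
            Reader in = new InputStreamReader(bytes, charsetName);
            StringBuffer text = new StringBuffer();
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                text.append(buf, 0, n);
            }
            return text.toString();
        } finally {
            // Closing the underlying stream releases the file handle.
            bytes.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java ExplicitCharsetRead <file>");
            return;
        }
        // Assumption: the configuration files are ISO-8859-1, as in the question.
        System.out.println(readFile(args[0], "ISO-8859-1"));
    }
}
```

The returned String can be handed straight to JTextArea.setText(); once the bytes have been decoded with the right charset, Swing does not care what the platform default is.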
> man locale
I have looked through it many times, and keep coming to the conclusion that it was written for people who already know what they're doing and need to remind themselves of a minor detail. For example, it makes no mention of the LANG environment variable, even though that seems to be fairly central to the issue of locales.
My version must be different than yours - I didn't mean to imply you hadn't read it.
Snip...
LANG
    Provide a default value for the internationalization variables that are unset or null. (See the Base Definitions volume
    of IEEE Std 1003.1-2001, Section 8.2, Internationalization Variables for the precedence of internationalization
    variables used to determine the values of locale categories.)
LC_ALL
    If set to a non-empty string value, override the values of all the other internationalization variables.
LC_CTYPE
    Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte
    as opposed to multi-byte characters in arguments and input files).
LC_MESSAGES
    Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard
    error.
NLSPATH
    Determine the location of message catalogs for the processing of LC_MESSAGES.
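
To see what the JVM actually derives from these variables, a small diagnostic along the following lines may help. This is only a sketch: the class name ShowDefaultEncoding and the helper defaultReaderEncoding are invented for illustration, and how the locale maps to an encoding is up to the JRE. It avoids System.getenv so that it should also run on a 1.4.2 JRE.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.util.Locale;

class ShowDefaultEncoding {

    // Hypothetical helper: name of the encoding a reader uses when no
    // charset is given, i.e. the default that FileReader would also use.
    static String defaultReaderEncoding() {
        InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(new byte[0]));
        return reader.getEncoding();
    }

    public static void main(String[] args) {
        // The system property, which may or may not reflect a -D override.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        // What readers really fall back to in this JVM.
        System.out.println("reader default = " + defaultReaderEncoding());
        // The default locale, for comparison with LANG / LC_ALL.
        System.out.println("default locale = " + Locale.getDefault());
    }
}
```

Running it once as usual and once with LC_ALL set to an ISO-8859-1 locale (one that `locale -a` lists as installed) should show whether the reader default follows the locale on your system.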