How to determine the character encoding of a string

I'm under the impression (however misguided this may be) that one of our databases (it is set to 8859) is outputting its values in the 8859 charset to Java, which in turn is preserving this encoding.

Printing the contents of data.get(info) yields garbage.

However, doing

value = new String(data.get(info).getBytes("ISO-8859-1"), "UTF-8");

and printing 'value' yields the proper Asian characters.

Is there a way to determine the Charset of a string somehow?

[581 byte] By [JeuLaFetea] at [2007-11-27 10:27:19]
# 1

Wouldn't the encoding be a property of the 'data' that you're getting a byte[] from? Once the String has been initialized, I don't think it preserves any information about the encoding the bytes were in.
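To illustrate the point, here is a minimal sketch (the class and method names are made up for this example). It shows that a charset only comes into play when converting between bytes and a String: the same String can be encoded to different byte sequences, and the same bytes decode to different Strings depending on the charset you pick.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a String carries no record of any byte encoding.
final class StringHasNoCharset {

    // Encoding: the charset is chosen at the moment bytes are produced.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Decoding: the charset is chosen at the moment bytes are interpreted.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String text = "\u65e5\u672c"; // two Japanese characters

        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] utf16 = encode(text, StandardCharsets.UTF_16BE);
        System.out.println("UTF-8 bytes:    " + utf8.length);
        System.out.println("UTF-16BE bytes: " + utf16.length);

        // Same bytes, two interpretations. Nothing in the resulting String
        // records which charset was used to build it.
        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println("decoded as UTF-8 equals original:      " + right.equals(text));
        System.out.println("decoded as ISO-8859-1 equals original: " + wrong.equals(text));
    }
}
```

So the question to ask is not "what charset is this String" but "what charset were the bytes in before someone decoded them".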

What kind of object is data?

Cogsya at 2007-7-28 17:44:48 > top of Java-index,Java Essentials,New To Java...
# 2

> that one of our Databases (it is set to 8859) is outputting

> its values as 8859 charset to Java,

All the JDBC drivers I know hide the database encoding from the Java program. You cannot set or influence the way the driver decodes the database data into Java characters.

It is not correct to declare the database as some Latin-x encoding and still store Asian characters in it. The 8859-x encodings are single-byte character sets and do not cover East Asian scripts such as Chinese, Japanese, or Korean.

BIJ001a at 2007-7-28 17:44:48
# 3

It is unfortunate that the iso-8859-x encodings can normally be mapped 1-1 to bytes. This means that any random set of bytes can be converted to a String using an iso-8859-x encoding, but then people have to go through contortions such as the OP's code to try to correct the problem.
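A minimal sketch of why the OP's contortion works, and of a heuristic for guessing when it applies. The class and method names are hypothetical. The assumption is that the original bytes were UTF-8 and were wrongly decoded as ISO-8859-1. Because that decode is lossless, encoding back to ISO-8859-1 recovers the original bytes, which can then be checked with a strict UTF-8 decoder. This is only a guess, not a real detection: a genuine Latin-1 string can happen to be valid UTF-8 as well.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: undo a wrong ISO-8859-1 decode of UTF-8 bytes.
final class MojibakeRepair {

    // True if the bytes are well-formed UTF-8. The decoder is told to
    // report errors instead of silently substituting replacement characters.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns the repaired text, or the input unchanged if the repair
    // does not look applicable.
    static String repair(String garbled) {
        // Characters above U+00FF cannot come from an ISO-8859-1 decode,
        // and getBytes would replace them with '?', so leave such text alone.
        for (int i = 0; i < garbled.length(); i++) {
            if (garbled.charAt(i) > 0xFF) {
                return garbled;
            }
        }
        // Lossless step: each char 0x00..0xFF maps back to exactly one byte.
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return isValidUtf8(original)
                ? new String(original, StandardCharsets.UTF_8)
                : garbled;
    }

    public static void main(String[] args) {
        String text = "\u65e5\u672c";
        // Simulate the problem: UTF-8 bytes decoded with the wrong charset.
        String garbled = new String(
                text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println("repaired equals original: " + repair(garbled).equals(text));
    }
}
```

Note that this only patches the symptom on the Java side. If the data is stored or transported with the wrong charset declared, every client would need the same workaround.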

The root cause of the problem seems to be, as the OP suspected, that the database has been set up to use iso-8859-x, and the fix has to be to correct the problem in the database configuration and not in Java.

sabre150a at 2007-7-28 17:44:48
# 4

By contrast, some database drivers for other languages (C, C++) make it possible to set the encoding on the client side irrespective of the database encoding setting. So you can end up with wrongly encoded data in your database when using those programs.

BIJ001a at 2007-7-28 17:44:48
# 5

> Whereas some database drivers for other languages (C,

> C++) make it possible to set the encoding on the

> client side irrespectively to the database encoding

> setting. So you can skew your database with those

> programs.

Some JDBC drivers allow one to specify the encoding as part of the connection URL. This may be a general property of JDBC drivers but, since I have now retired, my JDBC knowledge is getting as dated as I am, so I don't really know.
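As one example of what that can look like, here is a sketch assuming MySQL Connector/J, which accepts useUnicode and characterEncoding as URL properties. The host and database names are placeholders, and other drivers use different property names or none at all, so check your own driver's documentation before relying on this.

```java
// Hypothetical helper that builds a connection URL carrying an encoding hint.
// Assumption: the driver is MySQL Connector/J; other drivers differ.
final class JdbcEncodingUrl {

    static String buildUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // "db.example.com" and "inventory" are made-up placeholders.
        String url = buildUrl("db.example.com", "inventory");
        System.out.println(url);
        // With the driver on the classpath, this URL would be passed to
        // java.sql.DriverManager.getConnection(url, user, password).
    }
}
```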

sabre150a at 2007-7-28 17:44:48