Converting non-English characters to their Unicode representation

I have a series of files/templates, each containing text in a locale-specific language such as Chinese, Japanese, or German. How do I get their Unicode representations so I can send them as HTML-formatted email?

I can already send the English template as HTML-formatted email without a problem. I was also able to find a sample Unicode representation of Japanese text and send that as a test. But how do I take the templates I have and convert their contents into Unicode?

Thanks in advance.

Please disregard. I figured it out.

chehrehk

[579 byte] By [chehrehka] at [2007-10-3 2:41:34]
# 1

You need to know what character encoding was used for the template text. For example, you could have Japanese text encoded using UTF-8 or encoded using ISO-2022-JP, and the same Japanese characters would be represented as a different sequence of bytes. Without knowing which charset was used, you won't be able to convert the byte sequence back into Unicode characters (e.g., to store in a Java String).
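To make the "different sequence of bytes" point concrete, here is a small sketch that encodes the same three Japanese characters with more than one charset and prints the bytes as hex. The class and method names (EncodingBytesDemo, toHex) are made up for illustration. ISO-2022-JP is not one of the charsets every Java runtime is required to support, so the sketch checks for it before using it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingBytesDemo {
    // Encode the text with the given charset and render each byte as two hex digits.
    static String toHex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Three Japanese characters (U+65E5 U+672C U+8A9E), written as escapes
        // so the source file itself stays plain ASCII.
        String text = "\u65E5\u672C\u8A9E";

        System.out.println("UTF-8:       " + toHex(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE:    " + toHex(text, StandardCharsets.UTF_16BE));

        // Optional charset: only use it if this runtime provides it.
        if (Charset.isSupported("ISO-2022-JP")) {
            System.out.println("ISO-2022-JP: " + toHex(text, Charset.forName("ISO-2022-JP")));
        }
    }
}
```

The UTF-8 line should come out as E6 97 A5 E6 9C AC E8 AA 9E, while the other charsets produce different bytes for exactly the same characters. That is why the bytes alone don't tell you what the text is.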

If you do know which charset was used, a java.io.Reader (for example, an InputStreamReader constructed with that charset) will convert the byte stream into Unicode characters.
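As a rough sketch of the known-charset case, something like the following reads a template into a String by wrapping the byte stream in an InputStreamReader. The names TemplateReader, decode, and readTemplate are hypothetical, and the charset you pass in has to be the one the template was actually saved with.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

class TemplateReader {
    // Decode every byte from the stream into Unicode characters using the given charset.
    static String decode(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Convenience wrapper for a template stored in a file.
    static String readTemplate(Path file, Charset charset) {
        try (InputStream in = Files.newInputStream(file)) {
            return decode(in, charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Once the text is in a String, it is already Unicode. For the HTML email itself, one option with JavaMail would be to pass the String to setContent with a content type such as "text/html; charset=UTF-8" and let the library handle the encoding on the way out.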

If the charset information is not available, there are heuristics that you can use to try to guess the correct charset, but by their nature they're going to be wrong sometimes.
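One very simple heuristic of this kind, sketched below, is to try a list of candidate charsets in order and take the first one that decodes the bytes without errors. This is only an illustration, not a full charset detector: CharsetGuesser and guess are invented names, and the candidate list and its order are assumptions you would have to choose for your own templates.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.List;
import java.util.Optional;

class CharsetGuesser {
    // Return the first candidate charset that decodes the bytes without any
    // malformed or unmappable input, or empty if none of them does.
    static Optional<Charset> guess(byte[] data, List<Charset> candidates) {
        for (Charset cs : candidates) {
            try {
                cs.newDecoder()
                  .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of substituting
                  .onUnmappableCharacter(CodingErrorAction.REPORT)
                  .decode(ByteBuffer.wrap(data));
                return Optional.of(cs);
            } catch (CharacterCodingException e) {
                // These bytes are not valid in this charset; try the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

The weakness is easy to see: a single-byte charset such as ISO-8859-1 accepts any byte sequence at all, so it will "succeed" even on text that was really written in some other encoding. Putting strict encodings like UTF-8 first in the list helps, but the guess can still be wrong.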

bshannona at 2007-7-14 19:40:13