UTF-8 & database problems

I'm working on software which must support all possible languages. We have a UTF-8 database which is accessed via the jdbc80520 driver. The problem is that fetched Strings are still UTF-8 encoded and need to be converted back to 'Java strings'. Some examples suggest using the following technique:

String coded = (get data from database);

String decoded = new String(coded.getBytes(), "UTF8");

The problem is that this doesn't work as expected: some characters are not translated properly and show up as 0x3F in the byte array. The String constructor seems to end the string at the first 0x3F, so the result is only part of the fetched string. I have already tried writing my own getBytes, but it didn't work any better; actually it worked even worse.
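A minimal sketch of where the 0x3F bytes can come from, assuming the cause is a charset that cannot represent some characters: String.getBytes substitutes '?' (0x3F) for every character the target charset has no mapping for. The class name, helper method, and sample string are made up for illustration, and the charsets are passed explicitly (java.nio.charset.StandardCharsets, Java 7+) because the platform default used by the no-argument getBytes() is not known here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original post.
class QuestionMarkDemo {

    // Encodes the text with the given charset and counts the 0x3F ('?') bytes.
    static int countQuestionMarks(String text, Charset charset) {
        int count = 0;
        for (byte b : text.getBytes(charset)) {
            if (b == 0x3F) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "\u20AC\u00E4"; // euro sign followed by a-umlaut

        // Neither character exists in US-ASCII, so both become '?'.
        System.out.println(countQuestionMarks(text, StandardCharsets.US_ASCII));   // 2
        // ISO-8859-1 has the a-umlaut but not the euro sign.
        System.out.println(countQuestionMarks(text, StandardCharsets.ISO_8859_1)); // 1
        // UTF-8 can encode every character, so nothing is replaced.
        System.out.println(countQuestionMarks(text, StandardCharsets.UTF_8));      // 0
    }
}
```

Once a character has been replaced by 0x3F the original value is gone, so no later constructor call can bring it back.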

The same problem occurs with servlets when using UTF-8 as the page encoding and getting parameters (from a form). But once we configured Orion (the servlet engine) to use UTF-8 as the default encoding, decoding request parameters was no longer needed, since req.getParameter gives the correct string. Unfortunately the database still gives UTF-8 encoded strings, and they must be decoded ...

All information, solutions, links etc. are welcome :)

[1204 byte] By [marko_k] at [2007-9-26 4:36:34]
# 1

hi marko,

Your code seems to be OK. What is the default encoding of your database?

If it is already UTF-8, you do not need to encode it to UTF-8 again. Only if it is something else, like ISO-8859-1, do you need to convert it.
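As a rough sketch of the ISO-8859-1 case, assuming the driver hands back UTF-8 bytes that were decoded as ISO-8859-1 (one char per byte): the original bytes can be recovered by encoding with ISO-8859-1 and then decoding as UTF-8. The class and method names are hypothetical, and the "driver" is only simulated here; whether jdbc80520 actually behaves this way is not confirmed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not taken from any driver or library.
class Recode {

    // Turns a string that holds UTF-8 bytes misread as ISO-8859-1 back into
    // the intended text. ISO-8859-1 maps every byte to exactly one char,
    // so getBytes(ISO_8859_1) returns the original bytes unchanged.
    static String latin1ToUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00E4\u20AC"; // a-umlaut followed by euro sign

        // Simulate the assumed driver behaviour: UTF-8 bytes read as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(latin1ToUtf8(misdecoded).equals(original)); // true
    }
}
```

This differs from the no-argument getBytes() in that the charset used for the first step is explicit, so the result does not depend on the platform default encoding.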

It also depends on the original encoding of the data stored in the database. If it was something like Big5, where you are storing some East Asian languages, then you might lose some data once it is encoded to UTF-8, and you might not be able to retrieve it again.

So the rule of thumb is to set the default database encoding to the one in which the original data is stored; when retrieving from the database, you can then encode it as required. The code would be the same as what you are using right now.

Khurram

kilyas at 2007-6-29 17:54:41