UTF-8 to UCS-2 Encoding Conversion

Hi!

We have data stored in several encodings in our Oracle DB (UTF-8, UCS-2, ISO, etc.).

We need to retrieve the data from the database and display the result.

When I retrieve the data with rs.getString(ColName), it throws an exception saying that it cannot convert between UTF-8 and UCS-2.

When I use getBytes instead, I do not get an exception, but the result is not in a readable form, and we have to display the data.

Any idea how to convert between UTF-8 and UCS-2?

Thanks in advance!

[536 byte] By [pooja_10a] at [2007-10-3 4:39:03]
# 1

Well, if Java can understand a character set, it's pretty trivial to convert; you just use input and output streams for bytes, and readers and writers to convert the bytes to characters and vice versa given various character sets. Or you can use the String constructor that takes a charset name. Etc.
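As a minimal sketch of the conversion described above (the class and method names here are made up for illustration, and it assumes you already know which charset the bytes were written in), decoding to a String and re-encoding looks roughly like this:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TranscodeSketch {

    // Decode the bytes using the charset they were written in,
    // then re-encode the resulting String in the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    public static void main(String[] args) {
        // A German word with u-umlaut and sharp s, written with Unicode
        // escapes so the source file encoding does not matter.
        byte[] utf8 = "Gr\u00fc\u00dfe".getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);
        System.out.println(utf8.length + " UTF-8 bytes -> " + utf16.length + " UTF-16BE bytes");
    }
}
```

The important part is that the "from" charset has to match what is really in the byte array; if it does not, the String will contain junk no matter what you encode it to afterwards.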

The next issue is whether Java can understand a given character set. From the docs, as I read them, Java supports UTF-8 and UTF-16, and others, but not UCS-2 directly. But UCS-2 is a subset of UTF-16.

So from what I understand, it should be easy to convert from UCS-2 to, say, UTF-8, but the other direction might be a problem, at least if the characters are outside the Basic Multilingual Plane (code points above U+FFFF). UCS-2 cannot represent such characters.
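To illustrate that limitation, here is a small sketch (the helper name fitsInUcs2 is hypothetical) that checks whether a Java String stays inside the Basic Multilingual Plane. Java strings are UTF-16 internally, so any character above U+FFFF shows up as a surrogate pair, and a surrogate is exactly what UCS-2 has no way to express:

```java
public class Ucs2Check {

    // Returns true when every char is an ordinary 16-bit code unit,
    // i.e. the text contains no surrogate pairs and so stays within the BMP.
    static boolean fitsInUcs2(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String german = "Gr\u00fc\u00dfe";                        // all BMP characters
        String emoji = new String(Character.toChars(0x1F600));    // code point above U+FFFF
        System.out.println(fitsInUcs2(german)); // prints true
        System.out.println(fitsInUcs2(emoji));  // prints false
    }
}
```

Finnish and German text falls entirely within the BMP, so this particular limitation would not explain a failure on those languages.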

I'm deriving this info from my reading of the java.nio.charset.Charset API docs page and this:

http://en.wikipedia.org/wiki/UTF-16/UCS-2

Perhaps I'm wrong.

So anyway if you have an array of bytes from the database, and you're sure they're in UTF-16 format, then I believe you could just do:

String s = new String(byteArray, "UTF-16");

to get a String from those bytes.
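One caveat worth checking, sketched below under the assumption that the bytes really are 16-bit Unicode: Java's "UTF-16" charset looks for a byte-order mark when decoding and falls back to big-endian if there is none. If the data is little-endian without a byte-order mark, the same call produces unrelated characters, and "UTF-16LE" is the charset to try instead. The class and method names are only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class ByteOrderSketch {

    // Decodes with "UTF-16": honours a byte-order mark, otherwise assumes big-endian.
    static String decodeDefault(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    // Decodes as little-endian UTF-16 without expecting a byte-order mark.
    static String decodeLittleEndian(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // "Hi" as little-endian 16-bit code units, no byte-order mark.
        byte[] littleEndian = { 0x48, 0x00, 0x69, 0x00 };

        System.out.println(decodeLittleEndian(littleEndian)); // prints Hi

        // Read as big-endian, the same bytes become U+4800 and U+6900.
        String wrong = decodeDefault(littleEndian);
        for (int i = 0; i < wrong.length(); i++) {
            System.out.println(Integer.toHexString(wrong.charAt(i)));
        }
    }
}
```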

paulcwa at 2007-7-14 22:42:57 > top of Java-index,Java Essentials,Java Programming...
# 2

Hi Paul,

Thanks for the reply!

Actually, we have data in different formats in the DB.

I am facing a problem when retrieving the UCS-2 encoded data.

I tried your suggestion as well:

byte[] str = rs.getBytes(i);

String ucs2String = new String(str, "UTF-16");

But the data is not in a readable form. It displays junk characters.

Is there anything else I can try?

Thanks a million!

pooja_10a at 2007-7-14 22:42:57
# 3

> Actually we have data in different formats in DB.

> I am facing problem when retrieving UCS-2 encoding

> data.

>

> I tried your suggestion also....

> byte[] str = rs.getBytes(i);

> String ucs2String = new String (str,"UTF-16");

>

> But the data is not in a readable form.It is

> displaying some junk char.

>

This suggests that either the bytes obtained from rs.getBytes() are not UCS-2 or that the font you are using to display the string does not have glyphs for some of your characters.
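One way to tell those two cases apart is to look at the raw bytes rather than at the rendered text. The following is only a sketch with a made-up helper name; it prints a byte array as hex so that a known character can be compared against its expected encoding:

```java
import java.nio.charset.StandardCharsets;

public class HexDump {

    // Formats each byte as two lowercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String aUmlaut = "\u00e4"; // a-umlaut, common in Finnish and German
        System.out.println(toHex(aUmlaut.getBytes(StandardCharsets.UTF_8)));      // prints c3 a4
        System.out.println(toHex(aUmlaut.getBytes(StandardCharsets.UTF_16BE)));   // prints 00 e4
        System.out.println(toHex(aUmlaut.getBytes(StandardCharsets.ISO_8859_1))); // prints e4
    }
}
```

Passing the result of rs.getBytes(i) to a helper like this for a row whose text is known shows which of these patterns, if any, is actually stored.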

sabre150a at 2007-7-14 22:42:57
# 4

Hi!

I have the required font, as I am able to see the records for the same language on another web site.

I concluded that the result set we are getting is in UCS-2 format because I am getting this exception:

java.sql.SQLException: Fail to convert between UTF8 and UCS2: failUTF8Conv

for a few languages such as Finnish, German, and a few more.

For other languages such as English and Chinese we are not having any problem. These languages have UTF-8 encoding, and we are setting the XML encoding to UTF-8 to display the result:

xmlResultList.setEncoding("UTF-8");

Please note: we are migrating the application from iPlanet to Tomcat.

The same code was working fine earlier.

Not sure if we need to change a setting in Tomcat to support the encoding.

Thanks in advance!

pooja_10a at 2007-7-14 22:42:57
# 5
I don't think you can conclude the format based on that error message. Are there any sample strings in the db with known values you can look at?
paulcwa at 2007-7-14 22:42:57
# 6

> I don't think you can conclude the format based on

> that error message.

I agree. If anything, the error message suggests to me that the database does not contain valid UTF-8. What was the process that put the bytes in the database? Can you be sure it inserted valid UTF-8 sequences?
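A rough way to answer that question on the Java side is to run the raw bytes through a strict UTF-8 decoder. This is a sketch only (the class name is invented), and it assumes the bytes come from something like rs.getBytes() rather than from an already decoded String:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Validator {

    // Returns true only if the whole array decodes as well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String german = "Gr\u00fc\u00dfe";
        System.out.println(isValidUtf8(german.getBytes(StandardCharsets.UTF_8)));      // prints true
        System.out.println(isValidUtf8(german.getBytes(StandardCharsets.ISO_8859_1))); // prints false
    }
}
```

Pure ASCII text passes this check in almost any single-byte encoding, which would be consistent with English rows working while rows containing accented Latin letters fail. That is only one possible reading of the symptoms, not a diagnosis.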

sabre150a at 2007-7-14 22:42:57
# 7

We have the same application running on the iPlanet application server, also using the same DB.

There I am able to see the correct result.

The DB is the same and the code is the same; only the application server has changed to Tomcat.

I really wonder how it is working in iPlanet...

Do we need to change any setting in Tomcat to take care of encoding?

pooja_10a at 2007-7-14 22:42:57
# 8

> We have same application running on iPlanet

> application server aslo using the same DB.

> There i am able to see the correct result.

This does not mean that you place UTF-8 into the database. Just that the encoding you used when inserting is the same as that used when extracting.

>

> The DB is same , Code is same...only the apllication

> server is cahnged to Tomcat....

Or the encoding used has changed.

>

> I really wonder how its working in iPlanet...

>

> Do we need to make any setting for Tomcat to take

> care od encoding?

You still have not proved (to me) that you are putting UTF-8 into the database. Since we don't have any of your rogue bytes to work with, I don't see what we can do.

sabre150a at 2007-7-14 22:42:57
# 9

Sorry for the late reply, guys.

I have just fixed the issue.

On the server, the ojdbc12.jar file was for Oracle 9i, whereas we are using an Oracle 10g DB.

We replaced the jar file and it started working!
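For anyone wanting to check for the same kind of mismatch, a sketch like the one below reports which driver the application server has actually loaded and which database version it is talking to. It uses the standard JDBC DatabaseMetaData calls; the class name and the command-line arguments are placeholders, not part of our application:

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.SQLException;

public class DriverVersionReport {

    // Summarizes the loaded JDBC driver and the database it is connected to.
    static String describe(DatabaseMetaData meta) throws SQLException {
        return "driver: " + meta.getDriverName() + " " + meta.getDriverVersion()
                + ", database: " + meta.getDatabaseProductVersion();
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: DriverVersionReport <jdbc-url> <user> <password>");
            return;
        }
        // The JDBC driver jar must be on the classpath for this to connect.
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            System.out.println(describe(conn.getMetaData()));
        }
    }
}
```

Running the same check inside each application server would show whether they pick up different driver jars.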

I really appreciate your help.

Thank You All!!

Regards,

Pooja

pooja_10a at 2007-7-14 22:42:57
# 10

> Sorry Guys for replying late...

> Have just fixed the issue.

> On the server the ojdbc12.jar file was for Oracle 9i

> whereas we were using Oracle 10i DB.

> We replaced the jar file and it start working!!

>

> I really appreciate your help.

>

> Thank You All!!

>

> Regards,

> Pooja

Thanks for posting your findings. They will be helpful for future reference.

kilyasa at 2007-7-14 22:42:58