How to convert Java unicode to "Shift_JIS"

Hi Everybody,

I am trying to convert Java unicode to "Shift_JIS".

I'm passing HTML unicode (numeric character references) to the code below, and Java unicode is returned.

String temp = conJavaUnicode(sTemp); // sTemp contains the HTML unicode

public static String conJavaUnicode(String Str1)
{
    int num = 0;
    String[] strArr = {};
    StringBuffer strBuff = new StringBuffer();

    // Split the input into one fragment per numeric reference.
    strArr = Str1.split(";", 0);

    for (int count = 0; count < strArr.length; count++)
    {
        String str = strArr[count];
        String strTemp = null;

        // Take the decimal digits after '#' and convert them to hex.
        strTemp = str.substring(str.indexOf('#') + 1);
        num = Integer.parseInt(strTemp);
        String hex1 = Integer.toHexString(num);

        // Append a backslash, the letter u, and the hex digits as plain text.
        strTemp = "\\u" + hex1;
        strBuff.append(strTemp);
    }

    String x = strBuff.toString();
    return x;
}

I'm trying to convert the returned Java unicode to "Shift_JIS", using the following code:

byte[] byteShiftJis = temp.getBytes();
System.out.println("byteShiftJis" + byteShiftJis);
convJapanese = new String(byteShiftJis, "Shift_JIS");

I need to insert the variable convJapanese into the database. The database encoding is "Shift_JIS".

But when this code is executed, the Java unicode is not converted to "Shift_JIS"; the value I get in that variable is the Java unicode itself.

Please help me with this.

Thanks in advance,

Alex

[1439 byte] By [AlexRajua] at [2007-11-27 11:14:43]
# 1

A String in Java doesn't have an encoding. It's always in Unicode, if you like to think of it that way. If you want to convert it to Shift_JIS then you'll have to convert it to an array of bytes in Shift_JIS encoding. Like this:

byte[] byteShiftJis = temp.getBytes("Shift_JIS");

That's assuming that "Shift_JIS" is the correct name for the encoding, I haven't checked that.
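A minimal sketch of that suggestion, for illustration only. The class and method names (ShiftJisEncodeSketch, toShiftJis, toHex) are made up for this example. It uses Charset.forName, which throws an exception if the runtime does not support the given charset name, so it also serves as a quick check of the name. It assumes a JDK that ships the extended charsets.

```java
import java.nio.charset.Charset;

class ShiftJisEncodeSketch {
    // Encodes a Java string into Shift_JIS bytes.
    // The result is a byte array, not another String.
    static byte[] toShiftJis(String text) {
        // Throws UnsupportedCharsetException if the name is not available.
        return text.getBytes(Charset.forName("Shift_JIS"));
    }

    // Renders the bytes as hex digits so they can be inspected.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String kanji = "\u5bb6"; // one Unicode character
        byte[] sjis = toShiftJis(kanji);
        System.out.println(kanji.length() + " char, " + sjis.length + " bytes: " + toHex(sjis));
    }
}
```

Printing the byte array directly, as in the original post, only shows an object reference, which is why the sketch converts the bytes to hex first.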

DrClapa at 2007-7-29 14:08:30 > top of Java-index,Desktop,I18N...
# 2

You seem to misunderstand the concepts of strings in Java. A string in Java is always in Unicode (UTF-16).

What you need is a string (which is in Unicode), and then code which inserts that string into your database. If the database and JDBC driver are configured correctly, then the conversion to the appropriate encoding should happen automatically when the string is inserted into the database.
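As a rough sketch of how such a string might be obtained from the "HTML unicode" in the question: the example below assumes the input consists only of decimal numeric character references such as &#23478; and turns each one into the character it names, rather than into escape text. The class and method names are hypothetical.

```java
class NumericReferenceDecoder {
    // Turns text like "&#23478;&#26063;" into the characters it names.
    // Assumption: the input holds only decimal references separated by ';'.
    static String decode(String refs) {
        StringBuilder out = new StringBuilder();
        for (String part : refs.split(";")) {
            int hash = part.indexOf('#');
            if (hash < 0) {
                continue; // not a numeric reference, skip it
            }
            int codePoint = Integer.parseInt(part.substring(hash + 1).trim());
            out.appendCodePoint(codePoint); // appends the real character
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("&#23478;&#26063;");
        System.out.println(decoded.length()); // two characters
    }
}
```

The resulting String can then be handed to the JDBC driver as it is, with no byte-level conversion in application code.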

In my case, using DB2, my code looks like this:

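// requires: import java.sql.*;
try {
    // Load the DB2 JDBC driver.
    Class.forName("com.ibm.db2.jcc.DB2Driver");
    Connection db2Conn = DriverManager.getConnection("jdbc:db2://localhost:50000/TEST1", "db2_userid", "db2_password");
    PreparedStatement pst = db2Conn.prepareStatement("INSERT INTO TEST.USERS VALUES (?,?,?,?)");

    // Plain Java strings go in; a correctly configured driver handles the encoding.
    pst.setString(1, uid);
    pst.setString(2, uname);
    pst.setString(3, countryid);
    pst.setString(4, langid);
    pst.execute();

    pst.close();
    db2Conn.close();
} catch (Exception e) {
    e.printStackTrace();
}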

Where 'uid' etc. are strings.

I know that other databases (MySQL, for example) require you to specify the encoding in the connection string, so I suspect that you should focus on that part of your code.

one_danea at 2007-7-29 14:08:30
# 3

Hi,

Thanks for the responses.

Actually, in my user interface there are two text boxes in which Japanese text is entered, and I can't insert that text directly into the database.

So I'm converting that Japanese string to HTML unicode, and in the Java part I need to convert that to Shift_JIS format.

I'm trying to do that in the code given above, but the conversion to Shift_JIS is not happening.

This is the problem I'm facing.

Thanks and regards

Alex

AlexRajua at 2007-7-29 14:08:30
# 4

OK, then let's just look at what you wrote here:

byte[] byteShiftJis = temp.getBytes();
System.out.println("byteShiftJis" + byteShiftJis);
convJapanese = new String(byteShiftJis, "Shift_JIS");

I need to insert the variable convJapanese into the database. The database encoding is "Shift_JIS".

Here you create a byte array, but then you immediately create a Java string again from that byte array. And you then want to insert that string in the database, expecting it to be in Shift_JIS. But as I said, a string in Java is always in Unicode. So your roundtrip through a byte array (assuming that your system default encoding is Shift_JIS) back to a string wouldn't do anything useful to help you there.
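To make that concrete, here is a small sketch of such a roundtrip. Unlike the quoted code it names the same charset in both directions, which corresponds to the assumption that the system default encoding is Shift_JIS. The class and method names are invented for the example, and it assumes a JDK where the Shift_JIS charset is available.

```java
import java.nio.charset.Charset;

class RoundTripSketch {
    // Encodes the text to bytes and immediately decodes it again
    // with the same charset.
    static String roundTrip(String text, Charset cs) {
        byte[] encoded = text.getBytes(cs);
        return new String(encoded, cs);
    }

    public static void main(String[] args) {
        Charset sjis = Charset.forName("Shift_JIS");
        String original = "\u5bb6";
        String back = roundTrip(original, sjis);
        // The string that comes back equals the one that went in.
        System.out.println(original.equals(back));
    }
}
```

The Shift_JIS bytes exist only in the intermediate array; once they are decoded, the result is an ordinary Unicode string again.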

one_danea at 2007-7-29 14:08:30
# 5

Hi,

Thanks for the response.

I tried it and have almost solved this issue, but I'm facing another problem, which I'll explain.

temp = conJavaUnicode(sTemp); // I get a Java unicode value: "\u5bb6"
byte[] byteShiftJis = temp.getBytes();
convJapanese = new String(byteShiftJis, "Shift-JIS");

Here I'm getting convJapanese = "\u5bb6". I should not get that; I should get a value in "Shift-JIS".

But when I hardcode the value of temp, the above code works fine.

temp = "\u5bb6"; // I'm hardcoding a Java unicode value: "\u5bb6"
byte[] byteShiftJis = temp.getBytes();
convJapanese = new String(byteShiftJis, "Shift-JIS");

Here convJapanese gets the proper value, which is the desired output.

What is the difference between hardcoding a value and getting the value dynamically from a function?

Please tell me how I can overcome this issue, as hardcoding will not work in my application.

Please help me.

Thanks and regards

Alex

AlexRajua at 2007-7-29 14:08:30
# 6

I will repeat it a third time, to see if it makes any difference: You can't get a Java string that is "in Shift-JIS". Java strings are always in Unicode. No matter what sleazy hacks you use, it isn't going to work.

And I think you need to read this article. If you aren't going to listen to the people who are trying to give you answers then listen to an expert in the field:

http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
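On the hardcoded versus dynamic question, a sketch may help, with the caveat that the names below are invented for illustration. In Java source code the compiler translates a Unicode escape inside a string literal into the single character it names. Text assembled at run time from a backslash, the letter u and hex digits is never translated; it stays six ordinary characters.

```java
class EscapeSketch {
    // Mimics what the function in the question does: produces a backslash,
    // the letter u, and hex digits as plain text.
    static String buildEscapeText(int codePoint) {
        return "\\u" + Integer.toHexString(codePoint);
    }

    // Builds the actual character for the given code point.
    static String buildCharacter(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String literal = "\u5bb6";               // translated by the compiler
        String text = buildEscapeText(0x5bb6);   // built at run time
        String real = buildCharacter(0x5bb6);    // built at run time

        System.out.println(literal.length());     // 1
        System.out.println(text.length());        // 6
        System.out.println(literal.equals(real)); // true
    }
}
```

So a function that should behave like the hardcoded literal has to return the character itself, for example via Character.toChars, rather than escape text.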

DrClapa at 2007-7-29 14:08:30