Problem while displaying Chinese characters
Hi
I have some Chinese characters that I am storing in a db.
My approach is as follows:
Step 1:
I convert the file containing the Chinese characters into a file of \uXXXX Unicode escapes using the native2ascii tool, as shown below:
native2ascii Chinese_char.txt UniChinese_char.txt
Step 2:
Now I copy the Unicode escapes from the converted UniChinese_char.txt and paste them into my program, as shown below.
String uniStr="\u6c49\u5b57/\u6f22\u5b57\u4e0d\u6b63\u786e"; //string copied from file converted by native2ascii tool
String fileStr="";
FileInputStream f=new FileInputStream("c:/UniChinese_char.txt"); //read file converted by native2ascii tool
InputStreamReader sr=new InputStreamReader(f,"GBK");
BufferedReader bf=new BufferedReader(sr);
System.out.println(sr.getEncoding());
String data="";
while((data=bf.readLine())!=null)
{
fileStr=data;//store chars from file converted by native2ascii tool in fileStr variable
}
byte[] utf8Bytes = fileStr.getBytes("GBK"); //convert chars into GBK
String result=new String(utf8Bytes,"UTF8"); //convert chars into UTF8
System.out.print(uniStr+" "+result);
The uniStr string is displayed properly, i.e. as Chinese characters.
But a problem arises when I read the UniChinese_char.txt file with FileInputStream, as shown in the program above.
It displays Unicode string whenever it should display Chinese format of Unicode string.
i.e. when I print result variable which contains Unicode chars stored by string in Unicode format
it shows Unicode chars not Chinese format of Unicode string.
My file.encoding is GBK and my user.language is zh.
I have already set the OS font to Chinese.
I also tried converting the Chinese character file using the utf8 encoding option of the native2ascii tool, but it did not help.
Thanks for your help.
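One possible cause of the symptom described above: native2ascii writes literal \uXXXX escape sequences into the file, and only the Java compiler translates such escapes (as it does for the uniStr literal in the source). A Reader returns them as plain ASCII text. Below is a minimal sketch of decoding them by hand, assuming the file content has already been read into a string. The class and method names are hypothetical, and the sketch does not handle doubled backslashes or repeated 'u' characters.

```java
public class UnicodeEscapeDecoder {

    // Returns true if the four characters starting at 'start' are all hex digits.
    private static boolean hasFourHexDigits(String text, int start) {
        if (start + 4 > text.length()) {
            return false;
        }
        for (int i = start; i < start + 4; i++) {
            if (Character.digit(text.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Replaces each escape (backslash, 'u', four hex digits) with the char it names.
    // Everything else is copied through unchanged.
    public static String decodeEscapes(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length() && text.charAt(i + 1) == 'u'
                    && hasFourHexDigits(text, i + 2)) {
                out.append((char) Integer.parseInt(text.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Simulates a line read from the converted file: the backslashes are literal here.
        String fromFile = "\\u6c49\\u5b57";
        String decoded = decodeEscapes(fromFile);
        // Compare against the same characters written as a compiled string literal.
        System.out.println(decoded.equals("\u6c49\u5b57")); // prints true
    }
}
```

With this approach the file would be read as plain ASCII, and no getBytes/new String conversion is needed afterwards.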
I'll just comment on one place that seems clearly wrong:
> byte[] utf8Bytes = fileStr.getBytes("GBK"); //convert chars into GBK
>
> String result=new String(utf8Bytes,"UTF8"); //convert chars into UTF8
>
> It displays Unicode string whenever it should display
> Chinese format of Unicode string.
>
> i.e. when I print result variable which contains
> Unicode chars stored by string in Unicode format
>
> it shows Unicode chars not Chinese format of Unicode
> string.
In those two lines of code you create a byte array in GBK encoding and then create a new string from that byte array while stating that the bytes are UTF8.
Your comment "//convert chars into UTF8" seems to indicate that you think the code in that line performs a conversion INTO UTF8. It does no such thing. A string in Java is always Unicode (UTF-16 internally), and the line in question takes the bytes and converts them FROM UTF8 into Unicode. Since the byte array is presumably in GBK (if the characters were correctly encoded to begin with), the result will be wrong.
I don't know what you mean by "it shows Unicode chars not Chinese format of Unicode string", but I would not expect the characters to display correctly.
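To make the point about getBytes and new String concrete, here is a minimal sketch. It assumes the running JDK provides the GBK charset (full JDK distributions normally do), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Assumption: the GBK charset is available in the running JDK.
    private static final Charset GBK = Charset.forName("GBK");

    // The pattern from the question: encode as GBK, then decode the bytes as UTF-8.
    public static String encodeGbkDecodeUtf8(String s) {
        byte[] gbkBytes = s.getBytes(GBK);
        return new String(gbkBytes, StandardCharsets.UTF_8);
    }

    // Matching charsets on both sides: the string survives the round trip.
    public static String encodeGbkDecodeGbk(String s) {
        byte[] gbkBytes = s.getBytes(GBK);
        return new String(gbkBytes, GBK);
    }

    // An actual conversion INTO UTF-8 produces bytes, not a String.
    public static byte[] toUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u6c49\u5b57";
        System.out.println(encodeGbkDecodeUtf8(original).equals(original)); // prints false
        System.out.println(encodeGbkDecodeGbk(original).equals(original));  // prints true
        System.out.println(toUtf8Bytes(original).length);                   // prints 6
    }
}
```

The decode step must name the charset the bytes were actually written in. Decoding GBK bytes as UTF-8 yields replacement characters rather than the original text.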
Yes. I stored the characters from the original Chinese file (Chinese_char.txt) in an Oracle db, and the program stores the Chinese characters perfectly.
But a problem occurs when I read the same characters from the db using a ResultSet: it shows junk characters.
The code I used to store the original characters is:
private void createStatementFromTxtFile(String query)throws Exception
{
pstmt = (oracle.jdbc.OraclePreparedStatement)conn.prepareStatement(query);
pstmt.setFormOfUse(1, oracle.jdbc.OraclePreparedStatement.FORM_NCHAR);
String fileData=readFromFile(txtFile);//call to readFromFile method
pstmt.setString(1,new String(fileData.getBytes(),"ISO-8859-1"));
pstmt.execute();
pstmt.close();
System.out.println("String "+fileData+" Stored In DataBase Successfully.");
}
private String readFromFile(String txtFile)throws Exception
{
InputStream str=new FileInputStream(txtFile);
InputStreamReader in=new InputStreamReader(str);
int c;
String data="";
while ((c = in.read()) != -1)
{
data=data+(char)c;
}
in.close();
str.close();
return data;
}
The code I used to display the Chinese characters from the db is:
private void displayTable(String query)throws Exception
{
try
{
pstmt = (oracle.jdbc.OraclePreparedStatement)conn.prepareStatement(query);
ResultSet rset = pstmt.executeQuery();
String name = "";
while(rset.next())
{
name = rset.getString(1);
System.out.println("The Table Data Is :"+name);//here name contains junk chars
}
}
catch (SQLException sqe)
{
sqe.printStackTrace();
}
}
Thanks
I see that you use ISO-8859-1 when you store your Chinese characters in the DB, and that would seem to be an issue (although I have seen statements that this kind of workaround is actually recommended by Oracle). Since you say the characters are stored correctly, it seems to work.
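To illustrate what that expression does at the String level, here is a minimal sketch. It says nothing about how Oracle or the JDBC driver treat the value. The class and method names are hypothetical, and GBK is passed explicitly where the original code relies on the platform default (assumed to be GBK, as stated in the question).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1WorkaroundDemo {

    // Same idea as new String(fileData.getBytes(), "ISO-8859-1"), with an explicit charset:
    // each encoded byte becomes one char in the range 0-255.
    public static String packAsLatin1(String s, Charset cs) {
        return new String(s.getBytes(cs), StandardCharsets.ISO_8859_1);
    }

    // The reverse step: turn the chars back into bytes, then decode them with the real charset.
    public static String unpackFromLatin1(String packed, Charset cs) {
        return new String(packed.getBytes(StandardCharsets.ISO_8859_1), cs);
    }

    public static void main(String[] args) {
        // Assumption: the GBK charset is available in the running JDK.
        Charset gbk = Charset.forName("GBK");
        String original = "\u6c49\u5b57";

        String packed = packAsLatin1(original, gbk);
        System.out.println(packed.length());                                // prints 4
        System.out.println(packed.equals(original));                        // prints false
        System.out.println(unpackFromLatin1(packed, gbk).equals(original)); // prints true
    }
}
```

The packed string no longer contains Chinese characters, so anything that reads it back would have to apply the reverse step with the same charset. Whether that is what happens on the way through the database is not something this sketch can show.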
I don't know the specifics of Oracle, so I can't really speak to the code you use to fetch the results, but maybe you need to specify an encoding in the connect statement?
But as I said, I have no experience with Oracle.