Handling Unicode characters
Hi Friends,
I am trying to send and retrieve Hindi text on a web page (using Struts). The characters are transferred as Unicode.
I have a search text box in which I type Hindi words. On clicking submit, the words are sent to a search results page, where I show something like this:
<you searched for Hindi word> (The word appears in Hindi; this is just an example.)
My problem is that when I send the results back, I can display the Hindi text on the page, but the text box itself shows the raw numeric character references, like
&#2357;&#2367;&#2358;&#2366;&#2354;
in the text box, with the search results in Hindi below it.
(This is the Hindi Unicode for "Vishal"; I was just testing with that.)
When I right-click in IE and choose View Source, my text box contains
<input type="text" name="title" value="&amp;#2357;&amp;#2367;&amp;#2358;&amp;#2366;&amp;#2354;">
so each & has been converted to &amp;.
Note: for testing, if I remove this "amp;" everywhere in the value attribute, I get the desired text in the text box.
Do you have any suggestions on the same?
Thanks.
Vishal
[1176 byte] By [Vishal.MKa] at [2007-11-27 4:40:04]

# 1
How are you generating this input field?
Via a struts tag?
Yourself manually?
Something somewhere is escaping the values for you. Twice.
You only want to run the "escape" code once :-)
If you are using <c:out> try escapeXml="false"
If you are using <bean:write> try filter="false"
Hope this helps,
evnafets
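To illustrate the double-escaping point above, here is a minimal Java sketch. The escapeHtml helper is hypothetical and written only for this example; it is not the Struts or JSTL implementation. It assumes the value handed to the tag already contains numeric character references, which is what would turn & into &amp; in the page source.

```java
// Illustration only: escapeHtml is a hypothetical helper, not Struts code.
class DoubleEscapeDemo {

    // Minimal HTML attribute escaping. '&' is replaced first so that the
    // entities produced by the later replacements are not escaped again.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // Case 1: the value already holds numeric character references.
        String alreadyEscaped = "&#2357;&#2367;";
        // Case 2: the value holds the same two characters as real Unicode.
        String realCharacters = "\u0935\u093F";

        // The '&' of each reference is escaped, so the browser shows the
        // reference text instead of the Hindi character.
        System.out.println(escapeHtml(alreadyEscaped)); // &amp;#2357;&amp;#2367;

        // Real characters contain nothing to escape and pass through as-is.
        System.out.println(escapeHtml(realCharacters).equals(realCharacters)); // true
    }
}
```

If this is what is happening, the cleaner fix is probably to keep real Unicode characters in the form bean rather than to switch escaping off.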
# 2
> How are you generating this input field?
>
> Via a struts tag?
> Yourself manually?
>
> Something somewhere is escaping the values for you.
> Twice.
> You only want to run the "escape" code once :-)
>
> If you are using <c:out> try escapeXml="false"
> If you are using <bean:write> try filter="false"
>
> Hope this helps,
> evnafets
Thanks for your response!
For the text box I use <html:text property='title'/>, and that is where I type Hindi.
On some pages, Hindi also has to be displayed in other places.
For example, I have a page that displays the list of registered users.
Some entries are in English and some are in Hindi.
The output is displayed no differently, and when I view the source it has the Unicode in the same format as described earlier.
I also tried this:
<bean:write name="element" property="firstname" filter="false" />
but it does not seem to help.
Do you have any more suggestions?
I would appreciate that.
Thanks
Vishal
# 3
Suggestions that worked for me with struts and unicode:
* the following fragment in servlet.processRequest: response.setContentType("text/html;charset=UTF-8");
* the following fragment in formBean.validate, and formBean.reset: request.setCharacterEncoding("UTF-8");
* adding URIEncoding="UTF-8" to the connector element in your app server's server.xml
* reading the following: http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
http://www.roseindia.net/struts/
* HTTP header inspection
Now I can insert and search text in Chinese, Arabic, Hindi and everything else I try. I think it's remarkable that Unicode is not the default.
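As a rough illustration of why request.setCharacterEncoding("UTF-8") matters, here is a small standalone Java sketch. It does not use the servlet API. It assumes the browser submits the form as UTF-8 bytes and that the server decodes them as ISO-8859-1, which is the servlet default when the request declares no charset. The class and method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustration only: simulates how submitted bytes are turned into a String.
class RequestDecodingDemo {

    // Decode raw request bytes using the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Recover text that was wrongly decoded as ISO-8859-1. This works because
    // ISO-8859-1 maps every byte to exactly one char, so no data was lost.
    static String repair(String wronglyDecoded) {
        byte[] original = wronglyDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u0915\u0932"; // two Devanagari letters typed by the user
        byte[] wire = typed.getBytes(StandardCharsets.UTF_8); // bytes sent by the browser

        String wrong = decode(wire, StandardCharsets.ISO_8859_1);
        String right = decode(wire, StandardCharsets.UTF_8);

        System.out.println(wrong.length());              // 6, one char per byte
        System.out.println(right.equals(typed));         // true
        System.out.println(repair(wrong).equals(typed)); // true
    }
}
```

Setting the encoding before the first parameter is read avoids the wrong decode in the first place, so a repair step like the one above should not be needed in a correctly configured application.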
# 4
Thanks a lot. It helped!
Vishal
# 5
I have a new issue:
I have set my database table's CHAR SET to utf-8.
Now, when I send words (Hindi Unicode) to search for in the DB:
1) My word arrives in some odd format, because when I print it with a System.out.println like:
System.out.println("Search string received is = "+ searchForTitle);
it prints
Search string received is = ?
on the console.
2) So if I now execute my query as select * from <table> where name=<searchForTitle>
("searchForTitle" is the string for which I have to query the table),
it fails, giving 0 results.
My question is: what is the proper way to search a table in this case?
My table contains characters stored in Unicode format; for example, a record has:
14, 'कल', 'कल'
Here I am sending a Hindi word to be searched for.
I would appreciate your suggestions.
Thanks
Vishal
# 6
When the console isn't UTF-8, its output can look like junk even while every use case works correctly with Unicode and UTF-8 for the user. Try setting everything to Unicode and UTF-8. The one suggestion I forgot to mention before goes at the beginning of the JSP:
<%@page pageEncoding="UTF-8"%>
Add useUnicode=true&characterEncoding=UTF-8 to your DB connection string if you haven't. My JDBC connection string looks like the following: jdbc:mysql://localhost/database?user=java&password=javajava&useUnicode=true&characterEncoding=UTF-8
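To show how a console that isn't UTF-8 can print "?" even when the string itself is fine, here is a small Java sketch. It assumes the console charset cannot represent Devanagari (US-ASCII stands in for it here). The class and method names are made up for this example. Printing the code points is one way to check what the string really contains without depending on the console.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Illustration only: mimics System.out on a console with a limited charset.
class ConsoleEncodingDemo {

    // Return what a PrintStream using the named charset would show for text.
    static String printedAs(String text, String charsetName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buffer, true, charsetName);
            out.print(text);
            out.flush();
            return new String(buffer.toByteArray(), charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // List the code points of the string in U+XXXX form, which is plain
    // ASCII and therefore safe to print on any console.
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String searchForTitle = "\u0915"; // one Devanagari letter

        // The limited charset cannot encode the letter, so it is replaced.
        System.out.println(printedAs(searchForTitle, "US-ASCII")); // ?

        // A UTF-8 stream round-trips the same string unchanged.
        System.out.println(printedAs(searchForTitle, "UTF-8").equals(searchForTitle)); // true

        // The string itself is intact, as its code points show.
        System.out.println(codePoints(searchForTitle)); // U+0915
    }
}
```

So a "?" on the console alone does not prove the search string is broken; the code points tell you more.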
# 7
1)
> When the console isn't utf-8, its output can look
> like junk at the same time all use cases work
> correctly with unicode and utf-8 for the user. Try
> set everything to unicode and utf-8. The one
> suggestion I forgot to mention before was the
> beginning of the jsp:
> <%@page pageEncoding="UTF-8"%>
>
> Add useUnicode=true&characterEncoding=UTF-8 to your
> db connection string if you haven't. My jdbc
> connection string looks like to following:
> jdbc:mysql://localhost/database?user=java&password=javajava&useUnicode=true&characterEncoding=UTF-8
I added to the JSP
<%@ page language="java" pageEncoding="UTF-8" contentType="text/html; charset=utf-8"%>
and my connection string is now
jdbc:mysql://localhost:3306/myDB?relaxAutoCommit=true&useUnicode=true&characterEncoding=utf-8&
But I still get zero results from the table.
I have this &#2325; in my title column, which is the Unicode for क,
and when I type क in my text box, the query gives me zero results.
This is my query: select * from books where title = ? (I am not sure what is inserted after the = sign here)
I am also using Hibernate; I don't think that should create any problems.
2)
Also, what about the values that come from the database?
My Struts text boxes have &ampersand#2325
at the back end, while &#2325; shows at the front end.
(ampersand = amp;)
I solved the issue for the bean:write tags by using filter="false". Is there a similar way for text boxes too? The filter attribute is not available for the Struts html:text tag.
# 8
I can't think of more suggestions that can't be found here: http://www.whirlycott.com/phil/2005/05/11/building-j2ee-web-applications-with-utf-8-support/
I hope that you are successful.
I inserted क in my database and could search for क. If you want you can email me niklas_stockholm@yahoo.com and I'll send you a link to my example and code.
Good luck
# 9
Please try to post relevant code snippets right here. Otherwise, in the coming years this topic will be doomed by hundreds of users spamming their mail addresses in order to get the code ;)
# 10
> I can't think of more suggestions that can't be found
> here:
> http://www.whirlycott.com/phil/2005/05/11/building-j2ee-web-applications-with-utf-8-support/
> I hope that you are successful.
>
I appreciate your concern and help!
Finally, I added acceptCharset="UTF-8" to my Struts form, which is the equivalent of accept-charset="UTF-8" in HTML.
That seems to have worked. Without it, I guess the characters were being inserted into the table as numeric Unicode references, which resulted in 0 results for the search query.
> I inserted क in my database and could search
> for क. If you want you can email me
> and I'll send you a link
> to my example and code.
This क is wrong here: I wanted to write the Unicode number, but the preview converted it into this. Anyway, that is a different story.
> Good luck
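For anyone whose table already holds rows saved as numeric references (an assumption based on the guess above about how the data was inserted), a rough Java sketch of converting them back to real characters might look like this. The class and method names are made up for this example, and it only handles decimal references of the form &#NNNN;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: turns decimal numeric character references back into
// the characters they stand for.
class NumericReferenceDemo {

    // Matches decimal references such as &#2325; (up to 7 digits).
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    static String decodeNumericReferences(String s) {
        Matcher m = DECIMAL_REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            // Leave the reference untouched if the number is not a code point.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String stored = "&#2325;&#2354;"; // what may have ended up in the table
        String typed = "\u0915\u0932";    // what a UTF-8 form submits

        // The two strings are different text, so an equality search finds nothing.
        System.out.println(stored.equals(typed)); // false

        // After decoding, the stored value matches the search term.
        System.out.println(decodeNumericReferences(stored).equals(typed)); // true
    }
}
```

This is only meant for cleaning up old rows; once the form submits UTF-8, new rows should not need it.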