InputTextArea - Converter
Hi,
Users are pasting text from Word documents into inputTextArea fields on my pages. That text includes single and double curly quotes.
After saving the field, they come back as question marks.
How could I intercept those curly quotes and change them into something I can save, like straight single and double quotes?
I tried a custom converter but I don't know how to replace them. I have no clue what I'm supposed to look for in the string and what to replace it with. I tried String.replaceAll(), but what's the string I'm supposed to look for?
Thx in advance for your help.
[609 byte] By [Javaaaaaaa] at [2007-11-26 14:59:37]

# 1
You'd better double-check and adjust the locale and charset settings on the JSF pages, the appserver, the database, etc.
If you really want to use a converter, well, develop a small JSF webapp with a converter, play around with the converter code, find out the Unicode code of the curly quotes and use it in string.replace(char, char).
I guess you mean these curly quotes:
“ and ”
Their Unicode codes are \u201c and \u201d, by the way. Also see http://en.wikibooks.org/wiki/Unicode/Character_reference/2000-2FFF
With this knowledge you can use for example:
string.replace('\u201c', '"');
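A minimal sketch of that replacement step, assuming the quotes really do arrive as the Unicode characters U+2018, U+2019, U+201C and U+201D. The names `CurlyQuotes` and `straighten` are made up for illustration; in a JSF converter the call would presumably go in `getAsObject`.

```java
class CurlyQuotes {
    /** Replaces curly single and double quotes with straight ASCII quotes. */
    static String straighten(String input) {
        if (input == null) {
            return null;
        }
        // String is immutable: each replace() returns a new String,
        // so the calls are chained and the final result is returned.
        return input
                .replace('\u2018', '\'')  // left single quotation mark
                .replace('\u2019', '\'')  // right single quotation mark
                .replace('\u201c', '"')   // left double quotation mark
                .replace('\u201d', '"');  // right double quotation mark
    }

    public static void main(String[] args) {
        // prints: "Hello" and 'bye'
        System.out.println(straighten("\u201cHello\u201d and \u2018bye\u2019"));
    }
}
```

Note that the result of `replace` has to be used or assigned; calling it without keeping the return value changes nothing.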
# 2
Thx for your help.
I can't really change anything on the DB side. The DB's been there for a long time and they won't do a conversion.
No real appserver, we're using Tomcat. All our pages are using charset iso-8859-1. I wrote my own converter and I'm trying to figure out how to convert the curly quotes into something listed in the iso-8859-1 table. The thing is, when I look at the incoming String (in the debugger), the curly quotes are little squares (I'm guessing graphical characters).
So, how do I figure out what to use in the String.replaceAll(char,char)? By the way, the single quotes in string.replace('\u201c', '"'); are not allowed. They have to be double quotes string.replace("\u201c", "\"");.
If the incoming String is encoded in iso-8859-1, how do I replace an unrecognized character? Which method should I use?
Thx in advance.
# 3
> So, how do I figure out what to use in the
> String.replaceAll(char,char)? By the way, the single
> quotes in string.replace('\u201c', '"'); are not
> allowed. They have to be double quotes
> string.replace("\u201c", "\"");.
I've never seen String.replace(String, String) in the API documentation ..
It is really [url=http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#replace(char,%20char)]String.replace(char, char)[/url].
Well, try to capture that one graphical character and get its Unicode code. You can do this with Integer.toHexString(char). Then use that code in the String.replace(char, char) method.
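One way to do that capture, sketched as a small helper that lists the hex code of every non-ASCII character in a string. The names `CharDump` and `nonAsciiHex` are hypothetical, and the sample input in `main` is only a guess at what a pasted value might look like.

```java
class CharDump {
    /** Returns the hex codes of all non-ASCII characters, separated by spaces. */
    static String nonAsciiHex(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c > 0x7f) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                // The char is widened to int; the result is lowercase hex.
                out.append(Integer.toHexString(c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: 91 93 94 92
        System.out.println(nonAsciiHex("\u0091alt=\u0093content\u0094\u0092"));
    }
}
```

Each code printed this way can be written as a `\uXXXX` escape (padded to four hex digits) in a `replace` call.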
# 4
Thx BalusC,
Integer.toHexString(char) returned: 91
Now how do I replace that by a single quote?
Thx
# 5
string.replace('\u0091', '\'');
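A code of 91 is consistent with Word's Windows-1252 quote bytes (0x91 to 0x94) being decoded as ISO-8859-1, which maps each byte to the code point of the same value and so produces control characters rather than real quotes. The sketch below assumes that is what happens here; `Latin1Quotes` and `decodeLatin1` are illustrative names, and `StandardCharsets` needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

class Latin1Quotes {
    /** Decodes raw bytes the way a page declared as ISO-8859-1 would. */
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0x91 and 0x92 are the Windows-1252 bytes for the curly single quotes.
        byte[] pasted = {(byte) 0x91, 'h', 'i', (byte) 0x92};
        String string = decodeLatin1(pasted);

        // The first character is the control character U+0091; prints: 91
        System.out.println(Integer.toHexString(string.charAt(0)));

        // replace() returns a new String, so the result is kept; prints: 'hi'
        String cleaned = string.replace('\u0091', '\'').replace('\u0092', '\'');
        System.out.println(cleaned);
    }
}
```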
# 6
Hmmmm.
This returnedString = string.replace('\u0091', 'A'); doesn't replace anything but
this one returnedString = string.replace("\u0094", "D"); does. Weird.
u0094 is double quote "
u0091 is single quote '
P.S: I used letters just to test that replacement occurred.
Any suggestion BalusC ?
# 7
Are you running Java 5.0? I now see that it supports [url=http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#replace(java.lang.CharSequence,%20java.lang.CharSequence)]String.replace(CharSequence, CharSequence)[/url]. But does it work anyway?
# 8
Here's the text I'm trying to clean-up:
慳lt=攃ontent敀
In the debugger, the first single quote is \u0091 (which is the annoying MS Word smart quote). When I try to convert it to a standard single quote, it doesn't.
returnedString = string.replace("\u0091", "A");
That line doesn't replace the smart quote by an 'A'. Nor does this one:
returnedString = string.replace("‘", "A");
Don't give up on me now. :)
# 10
I'm a dumb ***. I was replacing the text in the original String every time, like this:
returnedString = string.replace("\u0091", "A");
returnedString = string.replace("\u0092", "B");
returnedString = string.replace("\u0093", "C");
returnedString = string.replace("\u0094", "D");
And looking at returnedString.
That's why.
I really need one more coffee.
Thx for your help.
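For anyone hitting the same thing, here is a sketch of the mistake next to one possible fix. The class and method names are hypothetical; the letters A to D are the same test replacements used above.

```java
class ReplaceChain {
    /** The mistake: every call starts again from the original string. */
    static String replaceFromOriginal(String string) {
        String returnedString;
        returnedString = string.replace("\u0091", "A");
        returnedString = string.replace("\u0092", "B");
        returnedString = string.replace("\u0093", "C");
        returnedString = string.replace("\u0094", "D");
        return returnedString; // only the last replacement survives
    }

    /** A fix: each call works on the result of the previous one. */
    static String replaceChained(String string) {
        String returnedString = string;
        returnedString = returnedString.replace("\u0091", "A");
        returnedString = returnedString.replace("\u0092", "B");
        returnedString = returnedString.replace("\u0093", "C");
        returnedString = returnedString.replace("\u0094", "D");
        return returnedString;
    }

    public static void main(String[] args) {
        String input = "\u0091x\u0092 \u0093y\u0094";
        // prints: AxB CyD
        System.out.println(replaceChained(input));
    }
}
```

This also explains why only the \u0094 test appeared to work earlier: it was the last assignment, so it was the only one still visible in returnedString.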