Preserve single quote in text data.

I have text data that is read into a BufferedReader. Some of the text contains single quotes, as in Ray's. When I retrieve the data, I see a square box in place of the single quote. What can I do in Java to preserve the single quote? Thanks.
[280 byte] By [FAT-BOYa] at [2007-11-26 21:28:28]
# 1

That's not a single quote like the one you put in your post. It's something else. Almost certainly one of those things that Microsoft jokingly calls "smart quotes".

To make it display properly, make sure you read your text data using the same encoding as the file actually uses. And use a font that can render the character properly.

You may need to do some debugging to see how that character is represented in the file and how it is represented in your Java code. Post again if that information confuses you.

DrClapa at 2007-7-10 3:09:01 > top of Java-index,Java Essentials,Java Programming...
# 2

Thanks for the reply. I'm new to Java, so I'm not certain where to set the proper encoding. Below is my code; hopefully you can point me in the right direction. Basically, I'm grabbing data from a CLOB field.

public static String ClobToString(Clob cl) throws IOException, SQLException
{
    if (cl == null)
        return "";
    StringBuffer strOut = new StringBuffer();
    String aux;
    BufferedReader br = new BufferedReader(cl.getCharacterStream());
    String ls = System.getProperty("line.separator");
    while ((aux = br.readLine()) != null)
        strOut.append(aux + ls);
    return strOut.toString();
}

Again, many thanks.

FAT-BOYa at 2007-7-10 3:09:01
# 3
I assumed the input was from a file. If it's from a database then to start with you have to assume the database is configured correctly and it's handling that character correctly. Did you find out how it's represented in your Java code? (Hint: System.out.println((int)
DrClapa at 2007-7-10 3:09:01
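One way to follow the hint above and see how a character is represented in Java code is to print the numeric value of each char. The sketch below is only an illustration: the class name CodePointDump and the method describe are made up for this example, and it assumes the text is already in a String.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointDump {
    /** Returns one "U+XXXX" entry per char in the text. */
    static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            // Casting a char to int exposes its numeric (UTF-16) value.
            out.add(String.format("U+%04X", (int) text.charAt(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Plain ASCII apostrophe.
        System.out.println(describe("Ray's"));      // [U+0052, U+0061, U+0079, U+0027, U+0073]
        // Curly apostrophe, written as an escape so the source file encoding does not matter.
        System.out.println(describe("Ray\u2019s")); // [U+0052, U+0061, U+0079, U+2019, U+0073]
    }
}
```

If the fourth entry is U+0027 the text holds an ordinary apostrophe; U+2019 would be the curly "smart quote" kind.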
# 4
Excuse me, but I'm really new to Java. How would I determine the encoding of the character? Does it pertain to a single character or to all the characters? Can you please explain more? Thanks.
FAT-BOYa at 2007-7-10 3:09:01
# 5
Hi there, Also, you might find this article useful: http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/index.html
appy77a at 2007-7-10 3:09:01
# 6

Thanks, but I'm reading my data in a desktop application. Do you have examples for this? Just to point out what I'm doing: I have a Java web service that reads the data from a database, then builds an XML document to be sent to my application. When debugging my Java code, I can examine the text visually and see the apostrophe, but when the text moves to .NET and is displayed, I see the odd character. Again, I'm new and trying to get a grasp on this; I appreciate the help.

Thanks.

FAT-BOYa at 2007-7-10 3:09:01
# 7

Okay, you're in over your head here. And nobody has time to bring you up to speed because there's a lot for you to learn. Read the document that appy77 linked to. Read this one too:

http://java.sun.com/docs/books/tutorial/i18n/text/convertintro.html

Google for other tutorials with the keywords "java unicode" and read them too.

DrClapa at 2007-7-10 3:09:01
# 8

The basic idea, as DrClap mentioned above, is that if the character encoding differs between two systems, there will be character conversion errors like the one you're experiencing with your application right now.

If you scroll down to the bottom of the article I linked above, it talks about database configuration.

If you want to preserve the single quote, you need to figure out what encoding each layer of your application (database, application) is using and change the encoding to be uniform throughout.

appy77a at 2007-7-10 3:09:01
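As a rough illustration of keeping one layer consistent, the sketch below encodes a small XML document as UTF-8 bytes and declares the same encoding in the XML prolog, so that a receiver can decode it the same way. The class name, the method name, and the note element are hypothetical; the thread does not show how the web service actually builds its XML.

```java
import java.nio.charset.StandardCharsets;

final class XmlUtf8Sketch {
    /**
     * Builds a tiny XML document and encodes it as UTF-8 bytes.
     * Assumes the text contains no markup characters such as '<' or '&';
     * real code would escape them or use an XML library.
     */
    static byte[] toUtf8Xml(String text) {
        // The declared encoding must match the charset used in getBytes below.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                   + "<note>" + text + "</note>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toUtf8Xml("Ray\u2019s");
        // Decoding with the same charset recovers the original text.
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(decoded.contains("Ray\u2019s")); // true
    }
}
```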
# 9
I admit I'm in over my head at this point, but thanks to your replies I hope to be up to speed soon. The encoding I'm getting from the text is CP1252. I tried converting to UTF-8, but instead of the odd square I'm now getting a ? for the apostrophe. Help! Thanks.
FAT-BOYa at 2007-7-10 3:09:01
# 10

The rectangle means you have a character that the font can't render properly. Usually in this case you have a valid character. But the question mark means you have an encoding failure. So whatever you did to convert the data didn't work the way you wanted it to.

Would you like to post the code you have so far?

DrClapa at 2007-7-10 3:09:01
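The question-mark symptom can be reproduced in a few lines. This is a minimal sketch, assuming the character in question is the curly apostrophe U+2019; ISO-8859-1 is used only because it has no slot for that character, not because the thread says it is involved. The class and method names are made up.

```java
import java.nio.charset.StandardCharsets;

final class QuestionMarkDemo {
    /** Encodes text to ISO-8859-1 bytes and decodes those bytes again. */
    static String roundTripLatin1(String text) {
        // String.getBytes substitutes the charset's replacement byte
        // for any character the charset cannot represent.
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(roundTripLatin1("Ray's"));      // Ray's
        System.out.println(roundTripLatin1("Ray\u2019s")); // Ray?s
    }
}
```

The ASCII apostrophe survives, while the curly one is lost for good at the encoding step: once it has become a ?, no later conversion can bring it back.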
# 11

Many thanks....here is my original code:

public static String ClobToString(Clob cl) throws IOException, SQLException
{
    if (cl == null)
        return "";
    StringBuffer strOut = new StringBuffer();
    String aux;
    String aux2;
    BufferedReader br = new BufferedReader(cl.getCharacterStream());
    String ls = System.getProperty("line.separator");
    while ((aux = br.readLine()) != null)
        strOut.append(aux + ls);
    return strOut.toString();
}

Here is the code with the encoding conversion:

public static String ClobToString(Clob cl) throws IOException, SQLException
{
    if (cl == null)
        return "";
    StringBuffer strOut = new StringBuffer();
    String aux;
    String aux2;
    BufferedReader br = new BufferedReader(cl.getCharacterStream());
    String ls = System.getProperty("line.separator");
    while ((aux = br.readLine()) != null)
        strOut.append(aux + ls);
    StringInputStream st = new StringInputStream(strOut.toString());
    InputStreamReader inStreamReader = new InputStreamReader(st, "UTF-8");
    BufferedReader bufReader = new BufferedReader(inStreamReader);
    StringBuffer strOut2 = new StringBuffer();
    while ((aux2 = bufReader.readLine()) != null)
        strOut2.append(aux2 + ls);
    return strOut2.toString();
}

I do thank you again for any help.

FAT-BOYa at 2007-7-10 3:09:01
# 12
You said the encoding of the text was CP1252. But when you read it, you tell the reader that the encoding is UTF-8. That's why you get encoding failures.
DrClapa at 2007-7-10 3:09:01
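The mismatch described above can be seen in isolation. The following is a sketch under the assumption that the bytes really are CP1252 (called windows-1252 in Java's charset names) and that the character is the curly apostrophe U+2019; the class and method names are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DecodeMismatchDemo {
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Decodes the given bytes with the given charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // In CP1252 the curly apostrophe is the single byte 0x92.
        byte[] bytes = "Ray\u2019s".getBytes(CP1252);

        // Same charset on both sides: the character comes back intact.
        String good = decode(bytes, CP1252);
        System.out.println(Integer.toHexString(good.charAt(3))); // 2019

        // Wrong charset: 0x92 alone is not valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD.
        String bad = decode(bytes, StandardCharsets.UTF_8);
        System.out.println(Integer.toHexString(bad.charAt(3)));  // fffd
    }
}
```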
# 13
How can I resolve this? Can you please provide a code snippet that would work with my existing code? I appreciate all your help.
FAT-BOYa at 2007-7-10 3:09:01
# 14

InputStreamReader inStreamReader=new InputStreamReader(st,"cp1252");

But that's only one step. If you're reading a file encoded in CP1252 then that doesn't mess up the data. As I recall there was also a database and a webservice and XML and some dot-net thing and a display in a GUI involved. Keep reading those tutorials, you need to make sure that none of the steps messes up the encoding. That's basically what the John O'Conner document is saying.

DrClapa at 2007-7-10 3:09:01
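A side note on the CLOB step itself: Clob.getCharacterStream() returns a java.io.Reader, which means the JDBC driver has already turned the stored data into Java characters. Assuming the database and driver are configured correctly, no second decoding pass through bytes should be needed at this stage. The sketch below is a hypothetical helper, not the poster's code; it takes a Reader so it can be tried without a database.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ReaderToString {
    /** Reads every line from the reader, re-joining lines with the platform separator. */
    static String readAll(Reader in) throws IOException {
        if (in == null) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        String ls = System.getProperty("line.separator");
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Characters are copied as-is; no byte conversion happens here.
                out.append(line).append(ls);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for cl.getCharacterStream().
        String text = readAll(new StringReader("Ray\u2019s\n"));
        System.out.println(Integer.toHexString(text.charAt(3))); // 2019
    }
}
```

With a real Clob the call would look like readAll(cl.getCharacterStream()). The remaining layers (XML, the web service response, the .NET client, the display font) still each need to be checked separately.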