Problem converting Unicode escapes from "#x...." to "\u..."
Hi,
I'm working on some Unicode decoding. The original data is in text format, and Unicode characters are written with a "#x" prefix; for example, "#x3008" means "〈". I used the code below to replace "#x" with "\u". However, after the replacement, "\u3008" just appears as the literal text "\u3008" instead of being decoded into "〈" in Java.
Could anyone help me with this?
Thanks a lot!
The test.java I used is below:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class test {
    public static void main(String[] argv) {
        // The \u3008 and \u3009 escapes here are in the source, so the compiler decodes them.
        String rawText = "CEG1 \u30086J\u3009 cells #x3008";
        // Sample data: Tabby #x30086J#x3009 (lane 4); 5#x20137
        Pattern p = Pattern.compile("#x([0-9A-Fa-f]{4})");
        Matcher m = p.matcher(rawText);
        // Replaces "#x3008" with the six literal characters \u3008 at runtime.
        String newtext = m.replaceAll("\\\\u$1");
        System.out.println(rawText);
        System.out.println(newtext);
    }
}
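
For reference, here is a minimal sketch of one way the decoding could be done, based on my understanding that the Java compiler translates "\u" escapes only when they appear in source code, so a string assembled at runtime keeps the backslash and digits as plain text. Instead of building a "\u" escape, the sketch parses the hex digits and appends the character itself. The class name HexEscapeDecoder and the method decode are made up for illustration, and it assumes every escape is exactly four hex digits, as in the pattern above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original test.java.
class HexEscapeDecoder {
    // Same pattern as above: "#x" followed by exactly four hex digits.
    private static final Pattern HEX_ESCAPE = Pattern.compile("#x([0-9A-Fa-f]{4})");

    static String decode(String text) {
        Matcher m = HEX_ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the four hex digits as a number and cast it to the character it encodes.
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded '$' or backslash from being read as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        // Copy whatever text follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // How the decoded characters display depends on the console encoding.
        System.out.println(decode("CEG1 #x30086J#x3009 cells #x3008"));
    }
}
```

With the four-digit assumption, "5#x20137" from the sample data is read as "#x2013" followed by "7". Characters outside the Basic Multilingual Plane would need more than four hex digits and are not handled by this sketch.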

