reading UTF-8
Basically, I need to read a UTF-8 encoded text file:
BufferedReader in = new BufferedReader(new FileReader(args[0]));
String str;
while ((str = in.readLine()) != null) {
    String UTF8Str = new String(str.getBytes(), "UTF-8");
    System.out.println("read line: " + UTF8Str);
    if (num == 0) {
        if (UTF8Str.equalsIgnoreCase("true")) {
            drawLine = true;
        }
        if (UTF8Str.equalsIgnoreCase("false")) {
            drawLine = false;
        }
    } else {
        strings.add(UTF8Str);
    }
    num++;
}
in.close();
The problem is that when I read the first line, it always contains an extra leading character (some stuffing, I guess), so the first line reads "?true" instead of "true". When I open the text file in a hex editor, the first bytes look like this:
EF BB BF ....
What am I doing wrong?
[1591 byte] By [eversora] at [2007-10-3 2:47:18]

InputStreamReader (of which FileReader is a subclass) has a constructor that takes an argument of type java.nio.charset.Charset specifying the character encoding of the stream. That is to be expected, since a reader translates a byte stream into a character stream. If you don't tell the InputStreamReader which encoding your file uses, it will decode the byte stream using the system default character encoding.
Brian
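To make the point about decoding concrete, here is a minimal sketch that decodes the same two bytes with two different charsets. The class name CharsetDemo and the helper firstLine are made up for this illustration, and the bytes come from an in-memory array rather than the poster's file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Hypothetical helper: decodes the first line of the bytes with the given charset.
    static String firstLine(byte[] bytes, Charset charset) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), charset))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The letter e with an acute accent (U+00E9) is the two bytes C3 A9 in UTF-8.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        // Decoded as UTF-8, the two bytes become one character.
        System.out.println(firstLine(bytes, StandardCharsets.UTF_8).length());

        // Decoded as ISO-8859-1, each byte becomes its own character.
        System.out.println(firstLine(bytes, StandardCharsets.ISO_8859_1).length());
    }
}
```

The first call yields a one-character string and the second a two-character string, which is the kind of mismatch you get whenever the reader's charset differs from the file's actual encoding.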
What you're doing is reading the file using the wrong encoding and then trying to patch it up. Don't do that. Just read the file with the right encoding to start with:
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(args[0]), "UTF-8"));
And leave out this:
String UTF8Str = new String(str.getBytes(), "UTF-8");
It isn't necessary. (And the variable name is ugly because it implies that it refers to a "UTF-8 String", and there is no such thing.)
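One more thing worth checking: the bytes EF BB BF in the hex dump are the UTF-8 byte order mark (BOM). Java's UTF-8 decoder does not strip it. It comes through as the character U+FEFF at the start of the first line, so even with the right encoding the first line may still not equal "true". A rough sketch of one way to handle that follows. The class BomAwareReader and the method readLines are names invented here, and the input is an in-memory stream standing in for the file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BomAwareReader {
    // Hypothetical helper: reads all lines as UTF-8 and drops a leading
    // U+FEFF (the decoded BOM) from the first line if it is present.
    static List<String> readLines(InputStream stream) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            boolean first = true;
            while ((line = in.readLine()) != null) {
                if (first && line.startsWith("\uFEFF")) {
                    line = line.substring(1); // strip the BOM character
                }
                first = false;
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same leading bytes as in the question: EF BB BF, then "true", a newline, and "x".
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 't', 'r', 'u', 'e', '\n', 'x'};
        List<String> lines = readLines(new ByteArrayInputStream(data));
        System.out.println(lines.get(0).equalsIgnoreCase("true"));
    }
}
```

For a real file you would pass new FileInputStream(args[0]) instead of the byte array. Files without a BOM are read unchanged, since the check only removes U+FEFF when it is the very first character.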