How to detect whether a word is ASCII or Unicode?

If I have a text field and I want to detect whether the word keyed in by the user is ASCII or Unicode, what should I do? I have no idea how to detect this. Thanks for the help.
[180 byte] By [LazyLearnera] at [2007-10-3 3:14:02]
# 1
In general, strings in Java are UTF-8, so it really does not matter whether a string is just ASCII or Unicode.
deepspacea at 2007-7-14 21:05:06 > top of Java-index,Java Mobility Forums,Java ME Technologies...
# 2

But the problem is that I am writing a media center MIDlet that lets the user view pictures. The user can rename a file. My concern is this: if user A's phone supports Korean or Chinese, user A may rename a file in Chinese and save it to the SD card. When the card is passed to user B, whose phone does not support Korean or Chinese, the file name cannot be displayed properly. What I want to do is restrict the user to renaming files with plain Latin ASCII characters only. Is there any way to do this? Thanks for the help!

LazyLearnera at 2007-7-14 21:05:06
# 3
Okay. Well, since ASCII is only seven bits, a character with the eighth bit set is not valid ASCII. So a simple bitmask will do all you need.
deepspacea at 2007-7-14 21:05:07
# 4
Hmm, I have never done anything with bits before. Would you mind giving me a simple example?
LazyLearnera at 2007-7-14 21:05:07
# 5

public boolean isASCII(String s){
    char[] data = s.toCharArray();
    for(int i=0;i<data.length;i++){
        // Reject any character that has bit 7 (0x80) set.
        if( (data[i]&0x80) == 0x80){
            return false;
        }
    }
    // No character had bit 7 set.
    return true;
}

This should do the trick, I guess...

deepspacea at 2007-7-14 21:05:07
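
One thing worth checking in the snippet above: a Java char is a 16-bit value, so the mask 0x80 looks only at bit 7 and ignores bits 8 to 15. A minimal sketch of a variant that compares the whole value against the ASCII range instead (the class name AsciiCheck is made up for illustration and is not from the thread):

```java
// Hypothetical helper, not part of the original post.
class AsciiCheck {
    // Returns true only if every char lies in the 7-bit ASCII range 0x00..0x7F.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            // Compare the full 16-bit value, not just bit 7.
            if (s.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }
}
```

Writing the test as (c & 0xFF80) != 0 would be an equivalent bitmask form. Note that this sketch also accepts ASCII control characters and characters such as '/', so a real file-name filter would probably need a stricter whitelist.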
# 6
Thanks a lot! I think this will be my first step into learning about bitmasks. Haha.
LazyLearnera at 2007-7-14 21:05:07
# 7
I tried the code but found that it didn't work properly. If I change the char[] to a byte[], though, it works fine.
baihaileia at 2007-7-14 21:05:07
# 8

I have a simple way to detect whether a text file is ASCII or Unicode.

Just view the text file in binary. You will see the difference in the first few bytes.

Generally, if the text file is in Unicode form (that is, little-endian UTF-16), the first two bytes are FF, FE, and if the text file is in UTF-8 form, the first three bytes are EF, BB, BF.

ASCII text, however, has no such extra information.

timothycna at 2007-7-14 21:05:07
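
The byte patterns described above are byte-order marks (BOMs). A rough sketch of checking for them at the start of a byte array follows; the class and method names are hypothetical, and the big-endian case (FE, FF) is an addition that the post does not mention:

```java
// Hypothetical helper for illustration; names are not from the thread.
class BomSniffer {
    // Returns a label for the byte-order mark at the start of data, or "none".
    static String detectBom(byte[] data) {
        // Mask with 0xFF because Java bytes are signed.
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF;
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFF && b1 == 0xFE) {
                return "UTF-16LE";
            }
            if (b0 == 0xFE && b1 == 0xFF) {
                return "UTF-16BE"; // assumption: not covered in the post
            }
        }
        return "none";
    }
}
```

A BOM is optional, so a result of "none" does not prove that the file is ASCII; it only means that no marker was found.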
# 9

He just wants to know whether Java strings contain Unicode characters, so looking at text files is useless!

The getBytes method won't work correctly. It will give you UTF-8 data, and if the string contains Unicode, you will read faulty data. It might work in most situations, though.

The char method should also work just fine. I can't see why it shouldn't... maybe some casting problems.

deepspacea at 2007-7-14 21:05:07
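
A note on the byte-based approach: in UTF-8, ASCII characters encode to single bytes below 0x80, and every byte of a multi-byte sequence has its top bit set, so a high-bit test on UTF-8 bytes can tell the two apart. A minimal sketch, assuming the device supports the "UTF-8" encoding name; the class and method names are made up for illustration:

```java
// Hypothetical helper, assuming "UTF-8" is a supported encoding name.
class Utf8AsciiCheck {
    // Returns true if the UTF-8 encoding of s contains no byte with the top bit set.
    static boolean isAsciiViaUtf8(String s) {
        byte[] data;
        try {
            // Name the encoding explicitly instead of relying on the platform default.
            data = s.getBytes("UTF-8");
        } catch (java.io.UnsupportedEncodingException e) {
            throw new RuntimeException("UTF-8 not supported: " + e.getMessage());
        }
        for (int i = 0; i < data.length; i++) {
            if ((data[i] & 0x80) != 0) {
                return false;
            }
        }
        return true;
    }
}
```

The no-argument getBytes() uses the platform's default encoding, which is not necessarily UTF-8, so results from it may differ between devices.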
# 10

I am in Shanghai, China, so I used Chinese characters to test the method. When I use "我" (I hope you can read Chinese characters), the method returns true, but not when I use "们". The reason might be as follows:

In Unicode, the two bytes representing a Chinese character do not always start with a 1 bit, while in UTF-8 they do.

So here, if I use UTF-8, it's OK.

baihaileia at 2007-7-14 21:05:07
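
The two characters mentioned here make the effect easy to see: "我" is U+6211 and "们" is U+4EEC. A small sketch (hypothetical class name) that applies the same bit-7 test as the code in post # 5 to each 16-bit value:

```java
// Hypothetical demo class; the mask is the one used in post # 5.
class MaskDemo {
    // True if bit 7 of the 16-bit char value is set.
    static boolean bit7Set(char c) {
        return (c & 0x80) == 0x80;
    }

    public static void main(String[] args) {
        char wo = '\u6211';  // low byte 0x11, bit 7 clear
        char men = '\u4EEC'; // low byte 0xEC, bit 7 set
        System.out.println(Integer.toHexString(wo) + " bit7=" + bit7Set(wo));   // prints "6211 bit7=false"
        System.out.println(Integer.toHexString(men) + " bit7=" + bit7Set(men)); // prints "4eec bit7=true"
    }
}
```

Since bit 7 of 0x6211 is clear, a test that looks only at that bit treats the first character as ASCII, which matches the behaviour reported here.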
# 11
In general, strings in Java are UTF-8, so of course the string you get from the text field is UTF-8!
timothycna at 2007-7-14 21:05:07
# 12
It is not. I tried both methods on my Moto E398. Using char[], it does not work correctly; using byte[], it's OK.
baihaileia at 2007-7-14 21:05:07
# 13
> In general, strings in Java are UTF-8, so of course the string you get from the text field is UTF-8!

Well, yes and no. Yes, strings in Java are generally UTF-8, but in memory they are all 2-byte Unicode, and getChars will give you a Unicode char array.
deepspacea at 2007-7-14 21:05:07
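
To illustrate the distinction drawn here between the in-memory form and an encoded form, a small sketch (hypothetical class name, assuming "UTF-8" is a supported encoding name) compares the number of 16-bit char units with the number of UTF-8 bytes for the same string:

```java
// Hypothetical helper for illustration only.
class EncodingSizes {
    // Number of 16-bit char units the string holds in memory.
    static int charUnits(String s) {
        return s.length();
    }

    // Number of bytes the same string needs when encoded as UTF-8.
    static int utf8Bytes(String s) {
        try {
            return s.getBytes("UTF-8").length;
        } catch (java.io.UnsupportedEncodingException e) {
            throw new RuntimeException("UTF-8 not supported: " + e.getMessage());
        }
    }
}
```

For the string "A" followed by U+6211, charUnits gives 2 while utf8Bytes gives 4, because the Chinese character is one char in memory but three bytes in UTF-8.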
# 14
I tested the code in my MIDlet. It passes even when I use Hebrew and Arabic words to rename the file. Is something going wrong somewhere?
LazyLearnera at 2007-7-14 21:05:07
# 15

Hmm, could you post the int values of some of these characters?

char unicodethingie = '...'; // put in some Hebrew or Arabic character

System.out.println("char: " + (int) unicodethingie);

deepspacea at 2007-7-21 10:07:18