Parsing a string using StringTokenizer
Hi,
I want to parse a string such as
String input = ab{cd}:"abc""de"{
and the extracted tokens should be as follows
ab
{
cd
}
:
"abc""de"
{
So I used the StringTokenizer class with the delimiters {, } and :
StringTokenizer tokenizer = new StringTokenizer(input,"{}:", true);
This way I can separate the tokens and also get the delimiters. The problem is that I don't know how to parse the part of the string that is in double quotes. If the double quote " is also taken as a delimiter, then ", abc, ", ", de and " will each come back as a separate token. My intention is to get the whole string inside the double quotes as a single token, including the quotes themselves. Moreover, if it contains the escape sequence "", that should be included in the token as well. Help please.
Thanks
[848 byte] By [Nancy12a] at [2007-10-1 18:51:14]

You want something similar to the way Excel does CSV. I have libraries that read Excel CSV files:
http://ostermiller.org/utils/ExcelCSV.html
The problem is that my library only handles one delimiter at a time rather than the three you want. The library is open source, so you could write a Lexer similar to the one the Excel CSV libraries use. I based that one on jflex: http://jflex.de/
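If pulling in a generated lexer feels like too much, the same idea can be written by hand in a few lines. The following is only a rough sketch, not code from the library above: the class name QuoteAwareLexer and the method tokenize are made up for illustration, and it assumes that a quoted token runs until the first " that is not doubled. A quote that is never closed simply swallows the rest of the input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library mentioned in this thread.
class QuoteAwareLexer {
    // Single-character delimiters that are returned as tokens of their own.
    private static final String DELIMS = "{}:";

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (DELIMS.indexOf(c) >= 0) {
                // A delimiter is a one-character token.
                tokens.add(String.valueOf(c));
                i++;
            } else if (c == '"') {
                // Quoted token: scan to the closing quote, treating "" as an escape.
                int j = i + 1;
                while (j < input.length()) {
                    if (input.charAt(j) == '"') {
                        if (j + 1 < input.length() && input.charAt(j + 1) == '"') {
                            j += 2; // escaped quote, still inside the string
                            continue;
                        }
                        break; // closing quote found
                    }
                    j++;
                }
                // Include the closing quote if there is one (assumption: an
                // unterminated quote takes everything up to the end of input).
                int end = Math.min(j + 1, input.length());
                tokens.add(input.substring(i, end));
                i = end;
            } else {
                // Plain token: runs until the next delimiter or quote.
                int j = i;
                while (j < input.length()
                        && DELIMS.indexOf(input.charAt(j)) < 0
                        && input.charAt(j) != '"') {
                    j++;
                }
                tokens.add(input.substring(i, j));
                i = j;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("ab{cd}:\"abc\"\"de\"{")) {
            System.out.println(token);
        }
    }
}
```

For the input from the question this prints ab, {, cd, }, :, "abc""de" and { on separate lines. The } shows up because it is in the delimiter set, just as it would with StringTokenizer and returnDelims set to true.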
A bit of a "sticky tape"-solution...
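import java.util.StringTokenizer;

public class Test {
    public static void main(String[] args) {
        String input = "ab{cd}:\"abc\"\"de\"";
        // returnDelims = true, so {, } and : come back as tokens too.
        // The quote character is deliberately NOT a delimiter here.
        StringTokenizer st = new StringTokenizer(input, "{}:", true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            // Strips the surrounding quotes; remove this if-block to keep
            // them in the token. An inner "" is left as it is.
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }
            // Caveat: a quoted string containing {, } or : is still split.
            System.out.println(token);
        }
    }
}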
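One thing the StringTokenizer version cannot do is keep a quoted string together when it contains {, } or : itself. If that matters, a regular expression with java.util.regex is another option. This is just a sketch under my own assumptions: the class name RegexTokenizer is invented, the quotes are kept in the token, and a stray quote without a partner is silently skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class RegexTokenizer {
    // Three alternatives, tried in this order:
    //   "[^"]*(?:""[^"]*)*"   a quoted string in which "" is an escaped quote
    //   [{}:]                 a single delimiter
    //   [^{}:"]+              a run of ordinary characters
    private static final Pattern TOKEN = Pattern.compile(
            "\"[^\"]*(?:\"\"[^\"]*)*\"|[{}:]|[^{}:\"]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // find() skips anything that matches none of the alternatives,
        // which here can only be an unbalanced quote character.
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("ab{cd}:\"abc\"\"de\"{")) {
            System.out.println(token);
        }
    }
}
```

With this, an input such as a:"x{y}" comes back as a, : and "x{y}", so the braces inside the quotes stay part of the quoted token.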