Parsing a string using StringTokenizer

Hi,

I want to parse a string such as

String input = ab{cd}:"abc""de"{

and the extracted tokens should be as follows

ab

{

cd

:

"abc""de"

{

To do this, I used the StringTokenizer class with the delimiters {, } and :

StringTokenizer tokenizer = new StringTokenizer(input,"{}:", true);

In this way, I can separate the tokens and also get the delimiters. The problem is that I don't know how to parse a string that contains double quotes. If the double quote " is also taken as a delimiter, then ", abc, ", ", de and " will each be returned as a separate token.

My intention is to get the whole double-quoted string as a single token, including the quotes themselves. Moreover, if it contains an escaped quote "", that should also be included in the token. Help please.

Thanks

[848 byte] By [Nancy12a] at [2007-10-1 18:51:14]
# 1
Once you find the first quote, search backwards from the end of the string until you find the last quote. Everything between the two is one token.
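
A rough sketch of that idea, assuming the input holds at most one quoted section. The class name QuoteSpanTokenizer and its helper are made up for illustration; only indexOf, lastIndexOf, substring and StringTokenizer are standard Java. Note that } comes back as a token too, since it is one of the delimiters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not part of any library.
final class QuoteSpanTokenizer {
    private static final String DELIMS = "{}:";

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int first = input.indexOf('"');     // first quote, searching forwards
        int last = input.lastIndexOf('"');  // last quote, searching backwards
        if (first < 0 || first == last) {
            // No quote, or a single unmatched quote: split on delimiters only.
            addPlainTokens(input, tokens);
            return tokens;
        }
        addPlainTokens(input.substring(0, first), tokens);
        // Everything from the first quote to the last quote is one token.
        tokens.add(input.substring(first, last + 1));
        addPlainTokens(input.substring(last + 1), tokens);
        return tokens;
    }

    // Splits unquoted text, returning the delimiters as tokens as well.
    private static void addPlainTokens(String text, List<String> out) {
        StringTokenizer st = new StringTokenizer(text, DELIMS, true);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
    }
}
```

With the input from the question, QuoteSpanTokenizer.tokenize should return ab, {, cd, }, :, "abc""de", {. If the input has two separate quoted sections, everything between the first and the last quote is merged into one token, so this only fits the single-section case.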
Linkera at 2007-7-11 14:02:00 > top of Java-index,Other Topics,Algorithms...
# 2

You want something similar to the way Excel does CSV. I have libraries that read Excel CSV files:

http://ostermiller.org/utils/ExcelCSV.html

The problem is that my library only supports one delimiter at a time rather than the three you want. The library is open source, and you could write a lexer similar to the one that the Excel CSV libraries use. I based that one on jflex: http://jflex.de/
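
For a small grammar like this one, a hand-written scanner may be enough instead of a generated lexer. The following is only a sketch under assumptions: it is not the ExcelCSV lexer and not jflex output, the class name QuoteAwareLexer is made up, and it assumes "" is the only escape inside a quoted string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner, written for illustration only.
final class QuoteAwareLexer {
    private static final String DELIMS = "{}:";

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int n = input.length();
        int i = 0;
        while (i < n) {
            int start = i;
            char c = input.charAt(i);
            if (DELIMS.indexOf(c) >= 0) {
                i++; // a delimiter is a one-character token
            } else if (c == '"') {
                i++; // step past the opening quote
                while (i < n) {
                    if (input.charAt(i) == '"') {
                        if (i + 1 < n && input.charAt(i + 1) == '"') {
                            i += 2; // "" is an escaped quote, stay inside the string
                            continue;
                        }
                        i++; // closing quote ends the token
                        break;
                    }
                    i++;
                }
            } else {
                // Plain text runs until the next delimiter or quote.
                while (i < n && DELIMS.indexOf(input.charAt(i)) < 0
                        && input.charAt(i) != '"') {
                    i++;
                }
            }
            tokens.add(input.substring(start, i));
        }
        return tokens;
    }
}
```

Delimiters inside quotes are left alone, so a:"x{y}" should come out as a, :, "x{y}". An unterminated quote swallows the rest of the input as one token; whether that should be an error instead is a design choice this sketch does not make.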

thedeadseaa at 2007-7-11 14:02:00
# 3

A bit of a "sticky tape"-solution...

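import java.util.StringTokenizer;

public class Test {

    public static void main(String[] args) {
        String input = "ab{cd}:\"abc\"\"de\"";

        // '"' is not a delimiter, so the quoted text stays in one token
        // as long as it contains none of '{', '}' or ':'.
        StringTokenizer st = new StringTokenizer(input, "{}:", true);

        while (st.hasMoreTokens()) {
            String token = st.nextToken();

            // Strips the enclosing quotes; drop this check to keep them.
            // The length test avoids an exception on a token that is a lone quote.
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }

            System.out.println(token);
        }
    }
}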
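
The snippet above depends on the quoted text never containing {, } or :. One alternative that avoids this is a regular expression, which is a different technique from the StringTokenizer approach in this post. This is a sketch only: the class name RegexTokenizer is made up, and it assumes "" is the only escape inside quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based tokenizer, for illustration only.
final class RegexTokenizer {
    private static final Pattern TOKEN = Pattern.compile(
            "\"(?:[^\"]|\"\")*\""   // quoted string; "" inside it is an escaped quote
            + "|[{}:]"              // a single delimiter
            + "|[^{}:\"]+");        // a run of any other characters

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group()); // quotes are kept as part of the token
        }
        return tokens;
    }
}
```

For the input from the question, RegexTokenizer.tokenize should return ab, {, cd, }, :, "abc""de", {. A quote with no closing partner matches none of the alternatives, so find() skips it silently; add a check if that case matters.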

prometheuzza at 2007-7-11 14:02:00