Parsing a text file
I have a text file (for now it is not going to be in XML format) that looks like this:
System
{
  Project
  {
    Instance
    {
      domain1
      {
        Server
        {
          AppServerHost ="host"
          AppServerPort = 80
          NetricsHost ="host"
          NetricsPort = 5051
          JassFileName ="jaas_client.conf"
        }
        Repository
        {
          # possible and supported values: shared | local
          Mode ="shared"
        }
        Transport
        {
          # possible and supported values: http | srv
          Protocol ="srv"
        }
        UserInterface
        {
          # possible and supported values: include | exclude
          SideBarContent ="include"
        }
      }
      domain2
      {...repeat
I need to parse the file into a tree structure that will be presented to a user. The user can then add their own domain, server, transport and so on, and those modifications will be written back to the file.
I'm not sure how to go about parsing this, since it seems quite tedious in a non-XML format. I don't know much about grammars, but I think there should be a grammar for this format that would make it easy to use some tool for parsing.
Can anybody suggest where I should start, and perhaps give a hint about what the grammar should look like?
Thanks.
[1846 byte] By [mehuld121a] at [2007-10-2 7:46:34]

The suggestions to use XML are interesting, particularly in the light of the fact that the OP indicated that XML is not a choice. What are we to do with data that is not in XML format to begin with? Throw it away? Give up? Write a translator?
And what did the OP ask for but help in writing a parser which would be the first step in building a translator?
If the data really is in the format described in the original post, where line breaks are significant and each line holds exactly one item of interest, your data is VERY easy to parse.
You displayed exactly 6 types of data lines:
1) node name
2) {
3) }
4) completely blank line
5) someAttributeIdentifier = someRandomString
6) # someRandomString
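For what it's worth, the six line types above can be told apart with a trim and a look at the first character. A minimal sketch, assuming one item per line as in the sample; `LineKinds`, `Kind` and `classify` are made-up names for illustration, not anything from the original post:

```java
// Hypothetical helper, not part of the original post: names the six line types listed above.
final class LineKinds {
    enum Kind { NODE_NAME, OPEN_BRACE, CLOSE_BRACE, BLANK, ATTRIBUTE, COMMENT }

    static Kind classify(String rawLine) {
        String line = rawLine.trim();           // indentation carries no meaning
        if (line.isEmpty()) return Kind.BLANK;
        char c = line.charAt(0);
        if (c == '#') return Kind.COMMENT;      // checked first: a comment may contain '=' or braces
        if (c == '{') return Kind.OPEN_BRACE;
        if (c == '}') return Kind.CLOSE_BRACE;
        if (line.indexOf('=') >= 0) return Kind.ATTRIBUTE;
        return Kind.NODE_NAME;                  // anything else is taken as a node name
    }
}
```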
You parse it something like this:
// Needs java.util.ArrayList and java.util.List. writeln(String) and spaces(int) are assumed
// static helpers: one writes a line to the already opened output, the other builds the padding.
static abstract class SubNode {
    String name;
    SubNode(String n) { name = n; }
    abstract void write(int indent); // every element knows how to write itself
}
static class Attribute extends SubNode {
    String value;
    Attribute(String n, String v) { super(n); value = v; }
    void write(int indent) { writeln(spaces(indent) + name + " = " + value); }
}
static class Node extends SubNode {
    List<SubNode> subNodes = new ArrayList<SubNode>();
    Node(String n) { super(n); }
    void add(SubNode sn) { subNodes.add(sn); }
    void write(int indent) { // writes out onto some already opened thing w/o exception
        writeln(spaces(indent) + name);
        writeln(spaces(indent) + "{");
        for (SubNode sn : subNodes) sn.write(indent + 2); // children go two columns deeper
        writeln(spaces(indent) + "}");
    }
}
static Node read() {
    // assumes some file was opened allowing nextLine() to read without exception.
    // nextLine() should trim leading and trailing whitespace, skip blank lines, and return "" on EOF
    ArrayList<Node> stackOfNodes = new ArrayList<Node>();
    Node curNode = null;
    Node rootNode = null;
    int iSplit;
    String line = nextLine();
    while (!line.equals("")) {
        char c = line.charAt(0); // legal because line is not ""
        if (c == '#') {
            // do nothing or do something here - was comment
        } else if (c == '}') {
            stackOfNodes.remove(stackOfNodes.size() - 1); // pop node - end of block
            // back to the parent, or null once the root block has been closed
            curNode = stackOfNodes.isEmpty() ? null : stackOfNodes.get(stackOfNodes.size() - 1);
        } else if (c == '{') {
            // do nothing - was start of new block
        } else if ((iSplit = line.indexOf('=')) >= 0) { // is an attribute
            String atName = line.substring(0, iSplit).trim();
            String atVal = line.substring(iSplit + 1).trim();
            curNode.add(new Attribute(atName, atVal));
        } else { // (case 1) start of new node
            Node newNode = new Node(line);
            if (curNode == null) rootNode = newNode; else curNode.add(newNode);
            curNode = newNode;
            stackOfNodes.add(curNode);
        }
        line = nextLine();
    }
    return rootNode;
}
// note: like all the best language tools, this routine will conveniently fail - generating
// null pointer exceptions and the like if used on improper input. Caveat Emptor!
BEWARE - this was just typed in off the top of my head. It is neither complete nor tested, and probably not even real Java code, but it should be close enough to give you an idea of how to move forward.
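The routine above leans on three helpers it never defines: nextLine(), writeln() and spaces(). One possible way to fill them in is sketched below. The class name `ConfigIo`, the `open` and `written` methods, and the choice to read from a String and write into a buffer are all assumptions made for illustration; a real program would more likely wrap a FileReader and a FileWriter.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper class; the original post only assumes these functions exist.
final class ConfigIo {
    private static BufferedReader in;
    private static final StringBuilder out = new StringBuilder();

    // Points the reader at some text and clears the output buffer.
    static void open(String text) {
        in = new BufferedReader(new StringReader(text));
        out.setLength(0);
    }

    // Returns the next non-blank line, trimmed, or "" at end of input.
    static String nextLine() {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) return line;
            }
            return "";
        } catch (IOException e) {
            throw new UncheckedIOException(e); // keeps callers free of checked exceptions
        }
    }

    // Builds a string of n blanks for indentation.
    static String spaces(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(' ');
        return sb.toString();
    }

    // Appends one line to the in-memory output.
    static void writeln(String s) {
        out.append(s).append('\n');
    }

    // Everything written since the last open().
    static String written() {
        return out.toString();
    }
}
```

Returning "" at end of input only works because blank lines are skipped, so an empty string can never be a real line.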
The following parser assumes only the recursive pattern
'[node name] { ... }', which makes it more flexible.
// Needs java.util.List and java.util.Vector.
private String _left = "{";
private String _right = "}";
public Node parse(String info) {
    // assuming syntax '[node name] { ... }'
    int s0 = info.indexOf(_left);
    int s1 = info.lastIndexOf(_right);
    String name = info.substring(0, s0).trim();
    String subInfo = info.substring(s0 + _left.length(), s1);
    // getting value or creating children
    int p0 = 0;
    int p1 = subInfo.indexOf(_left);
    if (p1 < 0) {
        return new Node(name, subInfo, null);
    } else {
        int p2 = nextEven(subInfo, p1);
        List childNodes = new Vector();
        while (p0 < p1 && p1 < p2) {
            childNodes.add(parse(subInfo.substring(p0, p2 + _right.length())));
            p0 = p2 + _right.length();
            p1 = subInfo.indexOf(_left, p0);
            if (-1 < p1) {
                p2 = nextEven(subInfo, p1);
            }
        }
        return new Node(name, null, childNodes);
    }
}
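// Returns the index of the closing delimiter that matches the opening delimiter at p0.
private int nextEven(String core, int p0) {
    int count = 1;
    int p1, p2;
    while (count > 0) {
        p1 = core.indexOf(_left, p0 + _left.length());
        p2 = core.indexOf(_right, p0 + _left.length());
        if (p2 < 0) {
            throw new RuntimeException("Wrong input.");
        }
        if (p0 < p1 && p1 < p2) {
            count++; // another block opens before the next close
            p0 = p1;
        } else {
            count--; // the next delimiter closes a block
            p0 = p2;
        }
    }
    return p0;
}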
public class Node {
    public String _name = null;
    public String _value = null;
    public List _children = null;
    public Node(String name, String value, List children) {
        _name = name;
        _value = value;
        _children = children;
    }
}
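Note that a leaf node built this way keeps its whole body as raw text in _value, so the attribute lines still have to be split up. A minimal sketch of that step, assuming one 'name = value' pair per line and '#' comments as in the sample file; `LeafAttributes` and its `parse` method are hypothetical names, not part of the code above:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns the raw body of a leaf node into name/value pairs.
final class LeafAttributes {
    static Map<String, String> parse(String body) {
        Map<String, String> attrs = new LinkedHashMap<String, String>(); // keeps file order
        for (String raw : body.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.charAt(0) == '#') continue; // blank line or comment
            int eq = line.indexOf('=');
            if (eq < 0) continue; // not an attribute line; ignored in this sketch
            // Values are kept verbatim, including any surrounding quotes.
            attrs.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return attrs;
    }
}
```

Keeping the quotes in the value means the text can be written back to the file unchanged.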
parza at 2007-7-16 21:32:43
