Regular expression
Hi all,
I need to generate a valid match for a given regular expression. For example, given the regular expression "[a-z][0-9][a-z][a-z]", I would like my program to generate any string that matches it. Does anyone know a Java API that could help me?
Thanks,
Paulo
Hi!
You should read these:
java.sun.com/docs/books/tutorial/essential/regex/
java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.htm
www.javaregex.com/
And I'll paste some code so you can see an example where I read from a file:
import java.io.*;
import java.util.regex.*;
import java.util.*;
import java.lang.*;
public class find_FILE_table_rows
{
    public find_FILE_table_rows() throws Exception
    {
        try
        {
            String linefeed;
            File fin = new File("records.txt");
            FileInputStream fis = new FileInputStream(fin);
            BufferedReader br = new BufferedReader(
                new InputStreamReader(fis));
            BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(new File("table_FILE.txt"))));
            StringBuffer sb = new StringBuffer();
            Hashtable htable = new Hashtable();
            String temp_line,tofile;
            Pattern p = Pattern.compile("^FN\\s.*$");
            Pattern p1 = Pattern.compile("^H5.*$");
            Pattern p2 = Pattern.compile("^H6.*$");
            Pattern p3 = Pattern.compile("^H7.*$");
            htable.put("FN", "NULL");
            htable.put("H5", "NULL");
            htable.put("H6", "NULL");
            htable.put("H7", "NULL");
            htable.put("CH5", "NULL");
            htable.put("CH6", "NULL");
            htable.put("CH7", "NULL");
            Matcher m;
            int i=0;
            while( (linefeed =br.readLine()) != null )
            {
                m = p.matcher(linefeed);
                //Pattern.matches("FN field");
                if(m.matches())
                {
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    htable.remove("FN");
                    htable.put("FN",tofile);
                    /*
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    out.write(tofile);
                    out.write(" ");
                    */
                }
                m = p1.matcher(linefeed);
                //Pattern.matches("H5");
                if(m.matches())
                {
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    htable.remove("H5");
                    htable.put("H5",tofile);
                    htable.remove("CH5");
                    htable.put("CH5",tofile);
                    /*
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    out.write(tofile);
                    out.write(" ");
                    */
                }
                m = p2.matcher(linefeed);
                //Pattern.matches("a*b", "aaaaab");
                if(m.matches())
                {
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    //my code
                    /*
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    out.write(tofile);
                    out.write(" ");
                    */
                }
                m = p3.matcher(linefeed);
                //Pattern.matches("a*b", "aaaaab");
                if(m.matches())
                {
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    //my code....
                    /*
                    temp_line = m.group();
                    tofile = temp_line.replaceFirst("^..\\s","");
                    out.write(tofile);
                    out.write(" ");
                    */
                }
            }//EOF
            String t = printHash(htable);
            out.write(t);
            out.close();
        }
        catch(FileNotFoundException fnfe) {System.out.println(fnfe);}
        catch(IOException ioe) {System.out.println(ioe);}
        catch(Exception e) {System.out.println(e);}
    }
    public static String printHash(Hashtable htable){
        //............my code
    }
}
Now...
To find something, you first compile it into a Pattern and then try to match it against the input with a Matcher.
For example:
Pattern p = Pattern.compile("^[a-z]");
Matcher m = p.matcher(linefeed);
With m.find(), this matches a lowercase first letter of a line (m.matches() would succeed only if the whole line were that single letter).
Pattern p = Pattern.compile("^[a-z].*$");
matches every line that starts with a lowercase letter,
and so on...
Happy programming!
REMEMBER TO READ THOSE WEB PAGES and to import java.util.regex.*;
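To make the difference between a partial match and a whole-line match concrete, here is a minimal sketch using only java.util.regex. The class name MatchDemo and its two helper methods are made up for illustration; they are not part of the code posted above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original post.
final class MatchDemo {
    // find() looks for the pattern anywhere it can apply; with the ^ anchor
    // this is true when the line begins with a lowercase ASCII letter.
    static boolean startsWithLowercase(String line) {
        return Pattern.compile("^[a-z]").matcher(line).find();
    }

    // matches() requires the ENTIRE input to match the pattern, so this is
    // true only when the line is exactly one lowercase letter.
    static boolean isSingleLowercase(String line) {
        return Pattern.compile("^[a-z]").matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(startsWithLowercase("hello")); // true
        System.out.println(startsWithLowercase("Hello")); // false
        System.out.println(isSingleLowercase("hello"));   // false
        System.out.println(isSingleLowercase("h"));       // true
    }
}
```

This is why the file-reading example pads its patterns with ".*$": it calls matches(), which needs the pattern to cover the whole line.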
>
> and i'll paste some code for you to see an example
> where i read from a file
'georous',
You seem to have misunderstood the original post. The OP is not asking how to use regex, he is asking how to create a String that will match a given regex.
This requirement has been posted previously on this forum but I seem to remember that no solution was found. It looks to be a very difficult problem.
Sabre
> This requirement has been posted previously on this forum but I
> seem to remember that no solution was found. It looks to be a very
> difficult problem.
It isn't a difficult problem, it's just extremely tedious, especially if you
can't get your hands on the internal DFA (Deterministic Finite Automaton)
generated by the regex compiler. Such a DFA is just a finite directed graph
with one or more 'accepting' states/vertices.
Finding all strings that would be accepted by the regex is equivalent
to finding all paths in the DFA that lead to an accepting vertex. Note that
there might be an infinite number of such paths for a given graph
(e.g. "a*"). If you don't limit the path/string length, the process would
never end.
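As a rough sketch of that idea (not taken from any regex library): the DFA below is written out by hand for "a*b", because java.util.regex does not expose its internal automaton. The class name DfaPaths, the state numbering, and the maxLen cut-off are all assumptions made for the example. It needs Java 9 or later for Set.of and Map.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical example: bounded path enumeration over a hand-built DFA.
final class DfaPaths {
    // DFA for "a*b": state 0 is the start state, state 1 is accepting.
    static final Map<Integer, Map<Character, Integer>> TRANSITIONS = new TreeMap<>();
    static final Set<Integer> ACCEPTING = Set.of(1);

    static {
        Map<Character, Integer> fromStart = new TreeMap<>();
        fromStart.put('a', 0); // stay in the start state on 'a'
        fromStart.put('b', 1); // move to the accepting state on 'b'
        TRANSITIONS.put(0, fromStart);
    }

    // Returns every accepted string of at most maxLen characters.
    static List<String> accepted(int maxLen) {
        List<String> out = new ArrayList<>();
        walk(0, new StringBuilder(), maxLen, out);
        return out;
    }

    // Depth-first walk; the length limit is what keeps "a*" from looping forever.
    private static void walk(int state, StringBuilder path, int maxLen, List<String> out) {
        if (ACCEPTING.contains(state)) {
            out.add(path.toString());
        }
        if (path.length() == maxLen) {
            return;
        }
        Map<Character, Integer> edges = TRANSITIONS.getOrDefault(state, Map.of());
        for (Map.Entry<Character, Integer> edge : edges.entrySet()) {
            path.append(edge.getKey());
            walk(edge.getValue(), path, maxLen, out);
            path.setLength(path.length() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        System.out.println(accepted(3)); // [aab, ab, b]
    }
}
```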
IMHO for the given regex: "[a-z][0-9][a-z][a-z]" it would be much simpler
to explicitly write a couple of loops that'll generate the accepted strings.
(there'd be 26^3 * 10 of them).
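A minimal sketch of those loops, hard-wired to this one regex; the class name LoopGenerator is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: one nested loop per position of "[a-z][0-9][a-z][a-z]".
final class LoopGenerator {
    static List<String> all() {
        // 26 * 10 * 26 * 26 = 175,760 strings in total.
        List<String> out = new ArrayList<>(26 * 10 * 26 * 26);
        for (char c1 = 'a'; c1 <= 'z'; c1++) {
            for (char d = '0'; d <= '9'; d++) {
                for (char c3 = 'a'; c3 <= 'z'; c3++) {
                    for (char c4 = 'a'; c4 <= 'z'; c4++) {
                        // Starting from "" forces string concatenation
                        // instead of adding the char values numerically.
                        out.add("" + c1 + d + c3 + c4);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> strings = all();
        System.out.println(strings.size());                    // 175760
        System.out.println(strings.get(0));                    // a0aa
        System.out.println(strings.get(strings.size() - 1));   // z9zz
    }
}
```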
kind regards,
Jos
> > This requirement has been posted previously on this
> forum but I
> > seem to remember that no solution was found. It
> looks to be a very
> > difficult problem.
>
> It isn't a difficult problem, it's just extremely
> tedious, especially if you
> can't get your hands on the internal DFA
> (Deterministic Finite Automaton)
> generated by the regex compiler. Such a DFA is just
> a finite directed graph
> with one or more 'accepting' states/vertices.
:-)
Jos,
Looks like you and I have a different concept of what 'difficult' means!
Sabre
> Looks like you and I have a different concept of what
> 'difficult' means!
Nah, one of the really, really difficult things is: heating up egg rolls. I was
"home alone" last Friday and attempted to make me some eggrolls.
I didn't notice some plastic wrapping when I dumped those egg rolls
in the pan. The mess started stinking, got stuck to the pan, I took the pan
off and attempted to clean the mess. I scrubbed so hard I broke off the
handle and afterwards I noticed that I had forgotten to put back a big
box of chocolate ice cream in the fridge again after I had taken out those
darn egg rolls.
Now *that* was difficult, especially when I had to explain that broken pan
and mess in the kitchen to my wife. Now beat that.
Recursively generating a bunch of paths in a DFA (graph) is so easy
compared to those complicated kitchen activities ;-)
kind regards,
Jos
I think the OP only wants one match, not all of them. (But I could be wrong.) Does that make the problem easier?
Hi Jos,
I hope you got all this on video Jos; it would be worth a few bucks. As I said Jos, it looks like you and I have a different concept of what 'difficult' means!
Sabre
> I think the OP only wants one match, not all of them.
> (But I could be wrong.) Does that make the problem
> easier?
On re-reading the original post, I think you are probably right. I assumed that the OP needed to exercise all logical paths in the regex.
Hello,
Thanks for all the answers.
I need only one match. What I actually need is to generate one string, built from boundary values, that matches the regular expression. For example:
[a-z][0-9][a-z][a-z] -> z9zz
or
a*b -> aaaaaaaaaaaaaaaaab (where the number of 'a' characters is a predefined maximum length).
or
[a-z]* -> zzzzzzzzzzzzzzzzzzzzzz
Does anybody know of an API that would let me attach a listener to the Deterministic Finite Automaton? Or do you think it would be easier if I parsed the regex myself?
Regards,
Paulo
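For the three examples in the post above, parsing the regex by hand can be sketched as below. This is only an illustration under strong assumptions: it understands nothing but letter/digit literals, simple "[x-y]" classes, and an optional "*" after either one. The class name BoundaryMatch, the method upperBoundary, and the starRepeat parameter (standing in for the "predefined maximum length") are invented for the example. It always picks the upper end of a range.

```java
import java.util.regex.Pattern;

// Hypothetical example: boundary-value string for a very small regex subset.
final class BoundaryMatch {
    static String upperBoundary(String regex, int starRepeat) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < regex.length()) {
            char c = regex.charAt(i);
            char pick;
            if (c == '[') {
                // Accept only the exact shape "[x-y]" (five characters).
                int close = regex.indexOf(']', i);
                if (close != i + 4 || regex.charAt(i + 2) != '-') {
                    throw new IllegalArgumentException("only [x-y] classes are supported: " + regex);
                }
                pick = regex.charAt(i + 3); // upper end of the range
                i = close + 1;
            } else if (Character.isLetterOrDigit(c)) {
                pick = c; // a plain literal matches itself
                i++;
            } else {
                throw new IllegalArgumentException("unsupported syntax at index " + i + ": " + regex);
            }
            int count = 1;
            if (i < regex.length() && regex.charAt(i) == '*') {
                count = starRepeat; // "*" is expanded to the chosen maximum
                i++;
            }
            for (int k = 0; k < count; k++) {
                out.append(pick);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperBoundary("[a-z][0-9][a-z][a-z]", 3)); // z9zz
        System.out.println(upperBoundary("a*b", 3));                  // aaab
        System.out.println(upperBoundary("[a-z]*", 3));               // zzz
        // Sanity check against the real regex engine.
        System.out.println(Pattern.matches("a*b", upperBoundary("a*b", 3))); // true
    }
}
```

Anything beyond this subset (alternation, groups, "+", "?", negated classes) would need a real parser or an automaton-based approach like the one Jos describes.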