Pattern

I am stuck. I am completely new to regex and am trying to write patterns that go through HTML code, but I can't figure out how to make a pattern, or extend the one I have, so that it deletes or skips over any "null" words or "&amp" words. This is what I have right now: Pattern p = Pattern.compile("(?<=\\>)(?=\\null)[^<]+");

[375 byte] By [mark07a] at [2007-11-27 10:32:04]
# 1

What do you mean "skip over"? Could you give some example in- and output?

prometheuzza at 2007-7-28 18:14:20 > top of Java-index,Java Essentials,Java Programming...
# 2

sure

input =<strong><font size="2" color="#000080">July13 2007 9:00 PM</font></strong>

</td>

output = nullAug 28 2007 9:00 PMnullnullnullTotal Patent Releasenullnull

I cut out a lot of the input, though; HTML code has lots of tags, lol. Oh, and I just realised that the pattern in my original code is "(?<=\\>)[^<]+", so it doesn't have the null in it.

mark07a at 2007-7-28 18:14:20
# 3

> sure

>

> input =<strong><font size="2"

> color="#000080">July13 2007 9:00

> PM</font></strong>

> </td>

> output = nullAug 28 2007 9:00 PMnullnullnull

> Total Patent Releasenullnull

Ok, that's the example input. Where's the example output?

> i cut out a lot of the input though... html code has

> lots of tags lol o and i just realise in my original

> compile code its "(?<=\\>)[^<]+" doesn't have the

> null in it

Do you know what that pattern does?

Also, have you considered using an HTML parser?
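For reference, here is a minimal sketch of what that pattern matches: every run of non-'<' characters that directly follows a '>', which includes whitespace-only runs between tags. The class and method names (TextBetweenTags, extract) are made up for illustration, and the blank-run filter is an addition that is not in the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TextBetweenTags {
    // Same pattern as in the question; '>' needs no escaping in a Java regex.
    private static final Pattern TEXT = Pattern.compile("(?<=>)[^<]+");

    /** Collects every non-blank run of text that directly follows a '>'. */
    static List<String> extract(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = TEXT.matcher(html);
        while (m.find()) {
            String text = m.group().trim();
            if (!text.isEmpty()) { // skip whitespace-only runs between tags
                found.add(text);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<strong><font size=\"2\" color=\"#000080\">"
                + "July13 2007 9:00 PM</font></strong>\n</td>";
        System.out.println(extract(html)); // prints [July13 2007 9:00 PM]
    }
}
```

Note that in the first version of the pattern, "\\null" reaches the regex engine as \n followed by "ull", i.e. a newline and then the letters "ull", so that lookahead does not test for the word "null".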

prometheuzza at 2007-7-28 18:14:20
# 4

output = null Aug 28 2007 9:00 PMnullnullnull

Total Patent Releasenullnull

It's right under the input. I have never used HTML or regex before, so I really have no clue what the pattern does; it's kind of trial and error and help, lol.

mark07a at 2007-7-28 18:14:20
# 5

In what way is

<strong><font size="2" color="#000080">July13 2007 9:00 PM</font></strong>

</td>

related to

nullAug 28 2007 9:00 PMnullnullnullTotal Patent Releasenullnull

?

How did July get changed to August? Where did Total Patent come from?

hunter9000a at 2007-7-28 18:14:20
# 6

OK, sorry, I took the wrong index number, so they don't match up. The dates should be the same, and "Total Patent" is in the line I am matching, but it's WAY out there and I didn't want to copy and paste the whole HTML line.

mark07a at 2007-7-28 18:14:20
# 7

> ok sorry i took the wrong index number so they don't

> match up.. the dates should be the same and the total

> patent is in the line i am matching but its WAY out

> there and i didn't want to copy paste the whole html

> line

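// A Java string literal cannot span lines, so the line break is written as \n.
String in = "<strong><font size=\"2\" color=\"#000080\">July13 2007 9:00 PM</font></strong>\n</td>";

// Remove everything from a '<' up to the next '>' (i.e. every tag).
String out = in.replaceAll("<[^>]*>", "");

System.out.println(in + "\n\n" + out);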
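Wrapped into something you can compile and run, the same idea might look like the sketch below. The class and method names (StripTags, clean) are hypothetical, and decoding "&amp;" and trimming the result are additions on my part, assuming that is what "&amp words" in the first post refers to.

```java
public class StripTags {
    /** Removes every tag, decodes &amp; and trims leftover whitespace. */
    static String clean(String html) {
        String text = html.replaceAll("<[^>]*>", ""); // drop every <...> tag
        text = text.replace("&amp;", "&");            // decode the one entity mentioned in the thread
        return text.trim();                           // drop the newline left before </td>
    }

    public static void main(String[] args) {
        String in = "<strong><font size=\"2\" color=\"#000080\">"
                + "July13 2007 9:00 PM</font></strong>\n</td>";
        System.out.println(clean(in)); // prints July13 2007 9:00 PM
    }
}
```

As for the "null" in your output: neither pattern in this thread can produce that text. In Java it usually appears when a null reference is concatenated into a String, so it may be coming from the code that assembles the output. That is only a guess, since that code was not posted.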

prometheuzza at 2007-7-28 18:14:20