Parsing href=".." with regular expressions
I need to get all the hyperlinks on a web page. I use this code, but it doesn't work 100%:
public void parseLinks(){
    String A = "([Hh][Rr][Ee][Ff]\\s*=\\s*\")";
    String B = "(?!#|[Hh]ttp|[Mm]ailto|.cgi|.css)";
    String C = "(.*)";
    String D = "(\\s*\")";
    String exp = A+B+C+D;
    Pattern p = Pattern.compile(exp);
    Matcher m = p.matcher(s);
    while(m.find()){
        System.out.println(m.group());
    }
}
where s is the string being parsed (the HTML document). It more or less works on regular links, e.g.:
<a name="qwerty" href="qwerty.html"> gives the output href="qwerty.html"
but I want it to give the output qwerty.html. How can I do this? It doesn't work on links like:
<a name="qwerty" href="qwerty.html" class="link"> gives the output href="/aktuell/index.html" class="link".
How can I get just the path?
[927 byte] By [geranm] at [2007-9-30 10:41:15]

If I have s = "<a name = \"asd\" href=\"asd.html\"" or s = "<a href=\"asd.html\"href<\\a>", it works with group 3.
But it doesn't work with s = "<a href=\"/openpos/index.html\" class=\"link\"><b>Open positions</b></a>",
which gives the output:
/openpos/index.html" class="link
Any suggestions? It seems that it fails if there is more than one " before the >.
Example code:
public void parseLinks(){
    String s = "<a name = \"asd\" href=\"asd.html\"";
    /*String s = "<a href=\"/openpos/index.html\" class=\"link\"><b>Open positions</b></a>";*/
    String A = "([Hh][Rr][Ee][Ff]\\s*=\\s*\")";
    String B = "(?!#|[Hh]ttp|[Mm]ailto|[Ll]ocation.|[Jj]avascript|.cgi|.css)";
    String C = "(.*)";
    String D = "(\\s*\")";
    String exp = A+B+C+D;
    Pattern p = Pattern.compile(exp);
    Matcher m = p.matcher(s);
    while(m.find()){
        System.out.println(m.group(3));
    }
}
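
A possible explanation, offered as a sketch rather than a definitive answer: the (.*) in C is greedy, so it runs to the last " on the line and only backs off far enough for D to match. That would account for class="link being swallowed whenever more attributes follow the href. Capturing [^"]* instead stops at the first closing quote. Also, (?!...) is a lookahead and is not counted as a capturing group, so with the pattern as posted the path should be in group 2, and group 3 is only the closing quote. The class name HrefExtractor and the method extractLinks below are made up for illustration, the (?i) flag stands in for the [Hh][Rr][Ee][Ff] character classes, and the .cgi/.css/location alternatives are left out on the assumption that a check at the start of the value was not what they were meant to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefExtractor {

    private static final Pattern HREF = Pattern.compile(
            "(?i)href\\s*=\\s*\""              // href=" with optional spaces, any letter case
            + "(?!#|http|mailto|javascript)"   // lookahead only, not a capturing group
            + "([^\"]*)"                       // group 1: everything up to the next quote
            + "\"");                           // closing quote, kept outside the group

    // Returns the value of every double-quoted href that passes the lookahead.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            // trim() drops spaces just inside the quotes, which the original D tolerated
            links.add(m.group(1).trim());
        }
        return links;
    }

    public static void main(String[] args) {
        String s1 = "<a name = \"asd\" href=\"asd.html\"";
        String s2 = "<a href=\"/openpos/index.html\" class=\"link\"><b>Open positions</b></a>";
        System.out.println(extractLinks(s1)); // prints [asd.html]
        System.out.println(extractLinks(s2)); // prints [/openpos/index.html]
    }
}
```

This only handles double-quoted href values. Single-quoted or unquoted attributes, and markup spread over unusual line breaks, would need more than this pattern, and an HTML parser is usually the more robust choice for real pages.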