Keyword search in a web page

Can anyone tell me how to search for a keyword in a web page using Java? Suppose I submit a query to Google; the results are spread over many pages. In the first page of results, I want to search for a keyword www. ...

[229 byte] By [abar_sowa] at [2007-11-27 11:57:50]
# 1

First, you need to download the web page using a URL and URLConnection. Post your code when you have that done.

> Can anyone tell how to search for a keyword in any
> web page in java...Suppose if
> am giving query in a google page, The results will be
> displaying in many pages. In the 1st page i want to
> search for a keyword www. ...

tjacobs01a at 2007-7-29 19:16:14 > top of Java-index,Java Essentials,Java Programming...
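
The download step suggested in reply # 1 could be sketched roughly as follows. This is a minimal sketch, not the poster's code: it assumes Java 7 or later, assumes the page is UTF-8 encoded, and uses `http://example.com/` as a placeholder address. The class name `KeywordSearch` and the helpers `readAll` and `countOccurrences` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class KeywordSearch {

    // Read the whole stream into a String, decoding it as UTF-8
    // (an assumption; a real page may declare a different charset).
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) >= 0) {
                sb.append(buf, 0, n); // append only the chars actually read
            }
        }
        return sb.toString();
    }

    // Count non-overlapping occurrences of keyword in text.
    static int countOccurrences(String text, String keyword) {
        if (keyword.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(keyword, from)) >= 0) {
            count++;
            from += keyword.length();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder defaults; pass a URL and a keyword on the command line.
        String address = args.length > 0 ? args[0] : "http://example.com/";
        String keyword = args.length > 1 ? args[1] : "www.";
        URLConnection conn = new URL(address).openConnection();
        try (InputStream in = conn.getInputStream()) {
            String page = readAll(in);
            System.out.println(countOccurrences(page, keyword)
                    + " occurrence(s) of \"" + keyword + "\"");
        }
    }
}
```

Keeping the search in its own method means it can be tried on any string before a network connection is involved.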
# 2

Apache Lucene: http://lucene.apache.org/

ParvatiDevia at 2007-7-29 19:16:14
# 3

code

import java.io.*;
import java.net.*;
public class page
{
    public static void main(String args[]) throws IOException
    {
        // Download the first page of Google results for "Testing" and save it to testing1.htm
        BufferedInputStream in = new BufferedInputStream(new URL("http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N").openStream());
        FileOutputStream fos = new FileOutputStream("testing1.htm");
        BufferedOutputStream bout = new BufferedOutputStream(fos, 1024);
        byte data[] = new byte[1024];
        int count;
        // Write only the bytes actually read; a read may return fewer than 1024 bytes
        while ((count = in.read(data, 0, 1024)) >= 0)
        {
            bout.write(data, 0, count);
        }
        bout.close();
        in.close();
    }
}

problem is:

C:\Program Files\Java\jdk1.5.0\bin>javac page.java

C:\Program Files\Java\jdk1.5.0\bin>java page
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1133)
        at java.net.URL.openStream(URL.java:1007)
        at page.main(page.java:8)

ravi.ramyaa at 2007-7-29 19:16:14
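
A 403 means the server understood the request and refused it. One possible cause, offered here as an assumption rather than a diagnosis, is that the request carries Java's default User-Agent header, which some servers reject. `URL.openStream()` gives no way to set headers, but `URLConnection.setRequestProperty` does. The sketch below shows that call; the class name `BrowserLikeRequest`, the helper `openWithUserAgent` and the agent string are all made up for illustration, and the server may still refuse the request.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public class BrowserLikeRequest {

    // Prepare (but do not yet send) a request with an explicit User-Agent header.
    static URLConnection openWithUserAgent(String address, String userAgent)
            throws IOException {
        URLConnection conn = new URL(address).openConnection();
        // Headers must be set before the connection is actually opened.
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // The agent string is an illustrative placeholder, not a value
        // the server is known to accept.
        URLConnection conn = openWithUserAgent(
                "http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N",
                "Mozilla/5.0 (compatible; KeywordSearchExample)");
        long total = 0;
        try (InputStream in = conn.getInputStream()) {
            byte[] data = new byte[1024];
            int count;
            while ((count = in.read(data)) >= 0) {
                total += count;
            }
        }
        System.out.println("Read " + total + " bytes");
    }
}
```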
# 4

Can anyone send me code to find a keyword starting with www. and ending with .doc (or .htm/.pdf) in a text file? I need to store the URL in a temporary string. For example, if the text file contains a link such as www.cdc.gov/hiv/testing.htm, I want to extract cdc.gov/hiv/ and pass it into my URL string.

ravi.ramyaa at 2007-7-29 19:16:14
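
The extraction asked for in # 4 can be approached with `java.util.regex`. The sketch below assumes that a link contains no whitespace, starts with `www.` and ends with `.doc`, `.htm` or `.pdf`. The class name `LinkExtractor` and the helpers `findLinks` and `directoryPart` are made up for illustration, and the pattern is deliberately simple, not a general URL parser.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {

    // "www." followed by non-space characters, ending in .doc, .htm or .pdf.
    private static final Pattern LINK = Pattern.compile(
            "www\\.\\S+?\\.(?:doc|htm|pdf)\\b", Pattern.CASE_INSENSITIVE);

    // Return every matching link in the order it appears in the text.
    static List<String> findLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher m = LINK.matcher(text);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    // Drop the leading "www." and the file name after the last '/'.
    // A link with no '/' is returned without the "www." prefix only.
    static String directoryPart(String link) {
        String withoutWww = link.substring("www.".length());
        int lastSlash = withoutWww.lastIndexOf('/');
        return lastSlash >= 0 ? withoutWww.substring(0, lastSlash + 1) : withoutWww;
    }

    public static void main(String[] args) throws IOException {
        // With no argument, fall back to the example link from the question.
        String text = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : "See www.cdc.gov/hiv/testing.htm for details.";
        for (String link : findLinks(text)) {
            System.out.println(link + " -> " + directoryPart(link));
        }
    }
}
```

Run without arguments, this prints `www.cdc.gov/hiv/testing.htm -> cdc.gov/hiv/`. Pass the path of a text file as the first argument to scan that file instead.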