Keyword search in a web page
Can anyone tell me how to search for a keyword in a web page using Java? Suppose I
submit a query on a Google page and the results are spread over many pages. On the first page, I want to search for a keyword www. ...
First, you need to download the web page using a URL and URLConnection. Post your code when you have that done.
> Can anyone tell me how to search for a keyword in a
> web page using Java? Suppose I
> submit a query on a Google page and the results are
> spread over many pages. On the first page, I want to
> search for a keyword www. ...
Code:
import java.io.*;
import java.net.*;
public class page
{
    public static void main(String args[]) throws IOException
    {
        java.io.BufferedInputStream in = new java.io.BufferedInputStream(new
            java.net.URL("http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N").openStream()
        );
        java.io.FileOutputStream fos = new java.io.FileOutputStream("testing1.htm");
        java.io.BufferedOutputStream bout = new BufferedOutputStream(fos, 1024);
        byte data[] = new byte[1024];
        int count;
        // read() returns the number of bytes actually read, or -1 at end of stream
        while ((count = in.read(data, 0, 1024)) >= 0)
        {
            // write only the bytes that were read, not the whole buffer
            bout.write(data, 0, count);
        }
        bout.close();
        in.close();
    }
}
The problem is:
C:\Program Files\Java\jdk1.5.0\bin>javac page.java
C:\Program Files\Java\jdk1.5.0\bin>java page
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1133)
at java.net.URL.openStream(URL.java:1007)
at page.main(page.java:8)
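A 403 means the server refused the request, so the download loop itself is probably not at fault. Some servers reject requests that carry Java's default User-Agent header. The sketch below assumes that is the cause here, which is not confirmed; it sets the header explicitly before the stream is opened. The class name PageRequest and the User-Agent string are made up for illustration, and the server may still refuse automated queries.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class PageRequest {
    // Hypothetical User-Agent value; any browser-like string could be used instead.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; KeywordSearch/1.0)";

    // Prepares a connection with an explicit User-Agent header.
    // openConnection() does not contact the server yet; the request is
    // only sent once getInputStream() is called on the returned object.
    static URLConnection open(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        return conn;
    }
}
```

In the posted program, the stream could then be obtained with PageRequest.open("http://www.google.co.in/search?q=Testing&hl=en&start=00&sa=N").getInputStream() in place of the URL.openStream() call.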
Can anyone send me code to find a keyword starting with www. and ending with .doc (or .htm/.pdf) in a text file? I want to store the URL in a temporary string. For example, if the text file contains a link such as www.cdc.gov/hiv/testing.htm, I want to extract cdc.gov/hiv/
and pass it into my URL string.
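One way to approach the extraction is a regular expression. This is a minimal sketch, assuming the file contents have already been read into a String and that every link has at least one "/" between the host and the file name. The class name LinkExtractor and the method extractBases are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkExtractor {
    // Matches "www." + host and directories + file name ending in .doc, .htm or .pdf.
    // Group 1 captures everything after "www." up to and including the last "/".
    private static final Pattern LINK = Pattern.compile(
        "www\\.((?:[\\w.-]+/)+)[\\w.-]+\\.(?:doc|htm|pdf)\\b");

    // Returns the host-plus-directory part of every matching link in the text.
    static List<String> extractBases(String text) {
        List<String> bases = new ArrayList<>();
        Matcher m = LINK.matcher(text);
        while (m.find()) {
            bases.add(m.group(1));
        }
        return bases;
    }

    public static void main(String[] args) {
        String sample = "See www.cdc.gov/hiv/testing.htm for details.";
        for (String base : extractBases(sample)) {
            System.out.println(base); // prints cdc.gov/hiv/
        }
    }
}
```

Each returned string can be assigned to the temporary URL variable. Links with other extensions, such as .html, are not matched by this pattern and would need to be added to the alternation.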