Reading text on a screen
Hey everyone. I'm working on a program that will read the text displayed in another program. When a certain keyword appears in the other program's text, my program will log it. The programs would run on a Windows system.
Unfortunately I'm not sure how to go about this. My guess is I would have to do some low-level programming and "listen in" to all the draw messages going through the system. In many ways this would resemble the text-to-speech programs used by the blind. However I don't know how to do this. Does anyone have any ideas?
I appreciate any help you all can give me.
OCR isn't the easiest thing in the world. You'll probably want to use some sort of third-party application out there.
does the other program output its text to a console window or to a gui? if it's a gui, you are ******. if it's a console, you can redirect its stdout
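For the console case, a minimal sketch of the stdout redirection mentioned above: start the other program as a child process and scan each line it prints. The class and method names (StdoutWatcher, matchingLines, watch) are made up for illustration, and it assumes the target program can be launched from Java and writes plain text lines to stdout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class StdoutWatcher {

    // Returns every line from the source that contains the keyword.
    static List<String> matchingLines(Reader source, String keyword) {
        BufferedReader in = new BufferedReader(source);
        return in.lines()
                 .filter(line -> line.contains(keyword))
                 .collect(Collectors.toList());
    }

    // Launches the command, merges stderr into stdout, and scans the output
    // until the child process closes its output stream.
    static List<String> watch(String keyword, String... command) throws IOException {
        Process child = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        try (Reader out = new InputStreamReader(child.getInputStream())) {
            return matchingLines(out, keyword);
        }
    }
}
```

This only collects the hits once the child exits. For a program that runs all day you would log inside the loop instead of building a list.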
It's a gui... The application receives its text as a constant stream from a different computer. Would it be easier for me to somehow intercept and decode the packets, and then search the text in it?
yes
but i think you should give up anyway, unless you are getting paid to do this
I am getting paid to do this :).
> but i think you should give up
I love the optimism!
Why not just use one of the existing spyware programs? There's plenty of code out there that can do this sort of thing.
you can listen on the same port as the program, but then the program will not work very well. also you will have a difficult time figuring out what is inside of those packets since it probably won't be just plain text, but encoded in some weird way across multiple packets
Ok, so maybe I could take a "screenshot" of the screen and develop an algorithm to search the pixels for a specified series of characters?
The letters all appear in a certain color and take up the same amount of space, so that would make things easier...
I don't know if that would run quickly enough. Particularly since a few new lines will be appearing every second or so.
im curious, what is it that you are doing that you need to do this?
and why would someone pay you to do it.
if the characters are monospaced, you should be able to divide the screenshot into a grid where in each square a character may appear, after that matching can be fast if its done on multiple threads...
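A rough sketch of the grid idea, assuming the text really is monospaced and drawn in one exact color with no anti-aliasing. Each cell is reduced to an on/off pixel signature and looked up in a table of glyphs learned from sample cells, so matching is one hash lookup per character. GridTextReader and its methods are names invented here, and the cell size, origin and text color are values you would have to measure from the real application.

```java
import java.awt.image.BufferedImage;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: exact-match glyph lookup on a fixed character grid.
final class GridTextReader {
    private final int cellW;
    private final int cellH;
    private final int textRgb; // text color as 0xRRGGBB (assumed exact)
    private final Map<String, Character> glyphs = new HashMap<>();

    GridTextReader(int cellW, int cellH, int textRgb) {
        this.cellW = cellW;
        this.cellH = cellH;
        this.textRgb = textRgb & 0xFFFFFF;
    }

    // Encodes one cell as a string of '1' (text-colored pixel) and '0' (anything else).
    String signature(BufferedImage img, int x0, int y0) {
        StringBuilder sb = new StringBuilder(cellW * cellH);
        for (int y = 0; y < cellH; y++) {
            for (int x = 0; x < cellW; x++) {
                boolean on = (img.getRGB(x0 + x, y0 + y) & 0xFFFFFF) == textRgb;
                sb.append(on ? '1' : '0');
            }
        }
        return sb.toString();
    }

    // Records the cell at (x0, y0), known to show character c, as the glyph for c.
    void learn(char c, BufferedImage img, int x0, int y0) {
        glyphs.put(signature(img, x0, y0), c);
    }

    // Reads one row of cells; empty cells become ' ', unrecognized cells become '?'.
    String readLine(BufferedImage img, int x0, int y0, int columns) {
        StringBuilder line = new StringBuilder(columns);
        for (int col = 0; col < columns; col++) {
            String sig = signature(img, x0 + col * cellW, y0);
            Character c = glyphs.get(sig);
            if (c != null) {
                line.append(c.charValue());
            } else {
                line.append(sig.indexOf('1') < 0 ? ' ' : '?');
            }
        }
        return line.toString();
    }
}
```

The screenshot itself can come from java.awt.Robot's createScreenCapture(Rectangle), which returns a BufferedImage. If the font is anti-aliased or the grid origin shifts, exact matching like this will break and you are back to real OCR.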
I believe it is being used to automatically scan news headlines that appear in a particular program. I'll try the grid division. Hopefully it will be fast enough.
Are you able, by hand, to select the text with the mouse and copy it to the clipboard?
If so, you could use the java.awt.Robot class to move the mouse to a suitable point on the screen then select and copy the text. From there you simply pull the text out of the clipboard and process it.
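Something along these lines, as a sketch only. It assumes the target window lets you drag-select text and copies on Ctrl+C, and that you already know the screen coordinates of the text area. ClipboardGrabber, grab and containsKeyword are made-up names, and the delays are guesses that would need tuning.

```java
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.datatransfer.DataFlavor;
import java.awt.event.InputEvent;
import java.awt.event.KeyEvent;
import java.util.Locale;

// Hypothetical sketch: select text with the mouse, copy it, read the clipboard.
final class ClipboardGrabber {

    // Drags from (x1, y1) to (x2, y2), presses Ctrl+C, returns the clipboard text.
    static String grab(int x1, int y1, int x2, int y2) throws Exception {
        Robot robot = new Robot();
        robot.setAutoDelay(50); // pause after each generated event (assumed value)

        robot.mouseMove(x1, y1);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseMove(x2, y2);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);

        robot.keyPress(KeyEvent.VK_CONTROL);
        robot.keyPress(KeyEvent.VK_C);
        robot.keyRelease(KeyEvent.VK_C);
        robot.keyRelease(KeyEvent.VK_CONTROL);

        robot.delay(200); // give the other program time to fill the clipboard
        return (String) Toolkit.getDefaultToolkit()
                .getSystemClipboard()
                .getData(DataFlavor.stringFlavor);
    }

    // Case-insensitive keyword check on the copied text.
    static boolean containsKeyword(String text, String keyword) {
        return text.toLowerCase(Locale.ROOT).contains(keyword.toLowerCase(Locale.ROOT));
    }
}
```

Note that this takes over the mouse and overwrites the clipboard every time it runs, so it is only practical on a machine nobody else is using.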
taking a screen capture and setting it to a grid seems like a terrible idea.
you can use jpcap to capture packets in java (There are two projects with this same name, that do the same thing, take your pick).
That will work if the text you are searching for is transmitted plain-text. Another thing I would do is try to find out where this text is coming from and if there is an api to hook yourself up to it.
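If the text does turn out to be plain text, the "spread across multiple packets" worry can be handled by keeping a small tail of the previous payload. Here is a sketch of that, independent of how the bytes are captured (no jpcap calls are shown, since the two projects have different APIs). StreamKeywordScanner is an invented name, and it assumes a single-byte encoding and payloads fed in stream order.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: count keyword occurrences in a byte stream that arrives in chunks.
final class StreamKeywordScanner {
    private final String keyword;
    private final StringBuilder tail = new StringBuilder(); // unmatched end of earlier payloads
    private int hits;

    StreamKeywordScanner(String keyword) {
        if (keyword.isEmpty()) {
            throw new IllegalArgumentException("keyword must not be empty");
        }
        this.keyword = keyword;
    }

    // Feeds the next payload (assumed single-byte plain text, in stream order).
    void feed(byte[] payload) {
        // ISO-8859-1 maps every byte to exactly one char, so chunk boundaries are safe.
        tail.append(new String(payload, StandardCharsets.ISO_8859_1));

        int from = 0;
        int idx;
        while ((idx = tail.indexOf(keyword, from)) >= 0) {
            hits++;
            from = idx + keyword.length();
        }

        // Keep only the characters that could still start a match with future data.
        int keepFrom = Math.max(from, tail.length() - (keyword.length() - 1));
        tail.delete(0, keepFrom);
    }

    int hits() {
        return hits;
    }
}
```

This does nothing for compressed, encrypted or otherwise encoded payloads. In that case you would have to work out the format first.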
This wouldn't be an attempt to detect market news would it?
I don't think I can select the text. I don't know if it's being used to scan market news. Would that be illegal or something? I'll look into jpcap, but the whole decoding thing seems like it would take longer than the grid system anyway.
> OCR isn't the easiest thing in the world. You'll
> probably want to use some sort of third part
> application out there.
Why don't you just write it for the OP you douche that walks.
> > OCR isn't the easiest thing in the world. You'll
> > probably want to use some sort of third part
> > application out there.
>
> Why don't you just write it for the OP you douche
> that walks.
LOL digging up old comments... haha... you really do have no life.
But here's one that's already written:
http://www.javaocr.com/
> The programs would run on a Windows system.
>
> My guess is I would have to do some low-level
> programming
Those two sentences in the original post lead to the question:
why Java?
- Packet sniffing cannot be done in pure Java.
- Java may be slow for on-the-fly character recognition.
- It is difficult to interact in Java with the OS and native applications.
Even if Java is used, it may become an unnecessary spaghetti of
Java and native code.
uh jpcap is probably one of the easiest third party libraries to use in java, and it sniffs network traffic.
But, I'm thinking the OP gave up the really terrible idea of taking a screen capture and trying to match patterns in pixels to find words(!!!). It is almost scary that someone would even suggest that.
I don't know why he is all bent about having to "decode" the packets.