Word predictor
Hi everyone
I am new to this forum so bear with me! Anyway, I am trying to create a program that works like this word predictor http://www.utoronto.ca/atrc/reference/staff/scheuhammer/WPDemo/index.html
I am thinking of storing the entire vocabulary in a text file, one word per line. But I think I also have to consider the probability of each prediction, i.e. would I need to give each word a value (which updates at runtime depending on the new words a user enters)?
These are just ideas, so any tips on how I could get started would be greatly appreciated.
Many thanks
Sarah
Personally, I do everything with serialized Objects.
Pro: you can store and structure any combination of Java types.
Con: you normally lose any previously serialized data if you change your class.
Maybe you store the data in a text file between program executions, but you would want to get it out of the text file and into a useful data structure pretty much immediately.
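To make that concrete, here is a minimal sketch of turning the lines of such a text file into a map of word counts. The file format is an assumption, not something settled in this thread: one entry per line, either `word` or `word count`. The class name `VocabularyLoader` and the method `parse` are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VocabularyLoader {
    // Assumed format (hypothetical): one entry per line, "word" or "word count".
    // A missing count defaults to 1; words are lower-cased so "The" and "the" merge.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\\s+");
            int count = parts.length > 1 ? Integer.parseInt(parts[1]) : 1;
            counts.merge(parts[0].toLowerCase(), count, Integer::sum);
        }
        return counts;
    }
}
```

The lines themselves could come from something like `Files.readAllLines(Paths.get("vocabulary.txt"))`, where the file name is only a placeholder.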
You have words and you have frequency values, and frequency values between words. You're looking at a graph structure.
Are you generating the probabilities as the user uses the program, or are you storing that in the file(s) as well?
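One possible reading of "frequency values between words" is a count of how often one word directly follows another. Under that assumption, the graph can be sketched as a map of maps. The names `WordGraph`, `observe` and `mostLikelyNext` are invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

class WordGraph {
    // edges.get(a).get(b) = how many times word b directly followed word a
    private final Map<String, Map<String, Integer>> edges = new HashMap<>();

    // Records every adjacent pair of words in the given sequence.
    void observe(String[] words) {
        for (int i = 0; i + 1 < words.length; i++) {
            edges.computeIfAbsent(words[i], k -> new HashMap<>())
                 .merge(words[i + 1], 1, Integer::sum);
        }
    }

    // Most frequent follower of the given word, or null if none has been seen.
    // Ties are broken arbitrarily (HashMap iteration order).
    String mostLikelyNext(String word) {
        Map<String, Integer> followers = edges.get(word);
        if (followers == null) {
            return null;
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : followers.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

Whether the counts are built up while the user types or loaded from a file, the structure stays the same; only the moment `observe` gets called changes.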
I suggest using a Map<Character, String[]>
So, when the user types the first letter of a word, you read the file and fill the String[] with the words that start with that Character. If there are fewer than 10 words, show them; otherwise wait for the second Character and keep narrowing in the same way. That's the way I would go, but I haven't thought the problem through very well.
PS: There might be other, far better solutions; this is just the first thing that came to mind.
Message was edited by:
Ruly-o_O
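A rough sketch of the first-letter idea above, with two liberties taken: it uses a `List<String>` per letter instead of a `String[]`, and it builds the index once from a word list instead of re-reading the file on each keystroke. The class name `FirstLetterIndex` and the method `suggest` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FirstLetterIndex {
    static final int MAX_SHOWN = 10; // threshold taken from the post above

    private final Map<Character, List<String>> byFirstLetter = new HashMap<>();

    FirstLetterIndex(List<String> words) {
        for (String w : words) {
            if (w.isEmpty()) {
                continue;
            }
            byFirstLetter.computeIfAbsent(w.charAt(0), k -> new ArrayList<>()).add(w);
        }
    }

    // Returns the words matching what has been typed so far, or an empty
    // list if there are still MAX_SHOWN or more candidates (keep typing).
    List<String> suggest(String typed) {
        List<String> matches = new ArrayList<>();
        if (typed.isEmpty()) {
            return matches;
        }
        List<String> bucket = byFirstLetter.get(typed.charAt(0));
        if (bucket == null) {
            return matches;
        }
        for (String w : bucket) {
            if (w.startsWith(typed)) {
                matches.add(w);
            }
        }
        return matches.size() < MAX_SHOWN ? matches : new ArrayList<>();
    }
}
```

The linear scan over one letter's bucket is probably fine for a small vocabulary; for a large one, a tree-shaped index avoids rescanning on every keystroke.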
> Are you generating the probabilities as the user uses the program, or are you storing that in the file(s) as well?
I don't know :) I'm just trying to get some ideas before starting to implement. I was thinking it would be better to adjust the probabilities in real time, because the user might use one particular word more than another. Also, increased use of novel words should raise their probability relative to the existing words.
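The real-time adjustment described here could be sketched as plain usage counts, where a word's probability is simply its count divided by the total number of recorded uses. That estimate is an assumption chosen for simplicity, and the names `AdaptiveVocabulary`, `recordUse`, `probability` and `predict` are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AdaptiveVocabulary {
    private final Map<String, Integer> counts = new HashMap<>();
    private int total = 0;

    // Call whenever the user finishes a word; novel words start at a count of 1.
    void recordUse(String word) {
        counts.merge(word, 1, Integer::sum);
        total++;
    }

    // Relative frequency of the word among everything recorded so far.
    double probability(String word) {
        Integer c = counts.get(word);
        return (c == null || total == 0) ? 0.0 : (double) c / total;
    }

    // Up to 'limit' words starting with the prefix, most used first;
    // ties are broken alphabetically so the order is stable.
    List<String> predict(String prefix, int limit) {
        List<String> matches = new ArrayList<>();
        for (String w : counts.keySet()) {
            if (w.startsWith(prefix)) {
                matches.add(w);
            }
        }
        matches.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(matches.subList(0, Math.min(limit, matches.size())));
    }
}
```

With this scheme a novel word climbs the ranking as soon as it is used more often than its neighbours, which seems to match the behaviour asked for.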
> > I would definitely build a tree structure. Each node is a character, and some nodes may also be flagged as terminals.
>
> A.k.a. a [url=http://en.wikipedia.org/wiki/Trie]Trie[/url]
That is actually not what I thought of. It looks like the Trie stores more than one character per node.
> Only the end nodes, the intermediate ones are characters (or could be syllables or whatever).
Ah, ok. I only looked at the picture to the right. :)
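For reference, the tree described in the quotes (one character per node, some nodes flagged as terminals) might look roughly like the sketch below. It is only an illustration; the names `CharTrie`, `insert` and `completions` are made up, and the children are kept in a `TreeMap` purely so that results come back in alphabetical order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CharTrie {
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean terminal; // true if the path from the root spells a complete word
    }

    private final Node root = new Node();

    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.terminal = true;
    }

    // All stored words that start with the prefix, in alphabetical order.
    List<String> completions(String prefix) {
        List<String> out = new ArrayList<>();
        Node node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return out; // no stored word has this prefix
            }
        }
        collect(node, new StringBuilder(prefix), out);
        return out;
    }

    // Depth-first walk below 'node', emitting a word at every terminal node.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.terminal) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey().charValue());
            collect(e.getValue(), path, out);
            path.setLength(path.length() - 1); // backtrack
        }
    }
}
```

Each keystroke then just moves one node further down the tree, so the cost of narrowing the candidates depends on the length of the prefix and not on the size of the vocabulary.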