Word predictor

Hi everyone

I am new to this forum, so bear with me! I am trying to create a program that works like this word predictor: http://www.utoronto.ca/atrc/reference/staff/scheuhammer/WPDemo/index.html

I am thinking of storing the entire vocabulary in a text file, one word per line. I think I also need to consider the probability of each prediction, i.e. would I need to give each word a value that updates at runtime depending on the words the user enters?

These are just ideas, so any tips on how I could get started would be greatly appreciated.

Many thanks

Sarah

[614 byte] By [sarahsmitha] at [2007-11-26 19:53:15]
# 1
Personally, I do everything with serialized Objects.
Pro: you can store and structure any combination of Java types.
Con: you normally lose any previously serialized data if you change your class.
agerard2a at 2007-7-9 22:44:51 > top of Java-index,Java Essentials,Java Programming...
# 2

Maybe you store the data in a text file between program executions, but you would want to get it out of the text file and into a useful data structure pretty much immediately.

You have words, a frequency value for each word, and frequency values between words (how often one word follows another). You're looking at a graph structure.
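
A rough sketch of that graph idea, assuming "frequency values between words" means counting how often one word directly follows another. The class name `WordGraph` and its methods are made up for illustration and do not come from any library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each word is a node, and each edge weight counts
// how often one word was typed directly after another.
class WordGraph {
    // previous word -> (following word -> number of times the pair was seen)
    private final Map<String, Map<String, Integer>> followers = new HashMap<>();

    // Record that `next` was typed directly after `previous`.
    void recordPair(String previous, String next) {
        followers.computeIfAbsent(previous, k -> new HashMap<>())
                 .merge(next, 1, Integer::sum);
    }

    // Return up to `limit` words seen after `previous`, most frequent first.
    // Ties are broken alphabetically so the result is deterministic.
    List<String> likelyNext(String previous, int limit) {
        Map<String, Integer> counts =
                followers.getOrDefault(previous, Collections.emptyMap());
        List<String> words = new ArrayList<>(counts.keySet());
        words.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(words.subList(0, Math.min(limit, words.size())));
    }
}
```

Whether pair counts are worth the extra memory over plain per-word counts depends on how much typed text is available to learn from.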

Are you generating the probabilities as the user uses the program, or are you storing that in the file(s) as well?

paulcwa at 2007-7-9 22:44:51
# 3

I suggest using a Map<Character, String[]>.

So, when the user types the first letter of a word, you read the file and fill the String[] with the words that start with that Character. If there are fewer than 10 words, show them; otherwise wait for the second Character and keep narrowing in the same way. That's the way I would go, but I haven't thought the problem through very carefully.
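
A minimal sketch of that first-letter lookup. It assumes the words are already in memory and uses a List<String> per letter instead of a String[] so that words can be added one at a time; the class name `FirstLetterIndex` and the constant `MAX_SHOWN` are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: words grouped by first letter, then narrowed
// further as the user types more characters.
class FirstLetterIndex {
    private static final int MAX_SHOWN = 10; // threshold taken from the post
    private final Map<Character, List<String>> byFirstLetter = new HashMap<>();

    void add(String word) {
        if (word.isEmpty()) return;
        byFirstLetter.computeIfAbsent(word.charAt(0), k -> new ArrayList<>())
                     .add(word);
    }

    // Returns the matching words once fewer than MAX_SHOWN remain,
    // or an empty list to signal "wait for another character".
    List<String> suggest(String typedSoFar) {
        List<String> matches = new ArrayList<>();
        if (typedSoFar.isEmpty()) return matches;
        List<String> bucket = byFirstLetter.getOrDefault(
                typedSoFar.charAt(0), Collections.emptyList());
        for (String w : bucket) {
            if (w.startsWith(typedSoFar)) matches.add(w);
        }
        return matches.size() < MAX_SHOWN ? matches : new ArrayList<>();
    }
}
```

Each call still scans the whole bucket for the first letter, so this is only reasonable for small vocabularies.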

PS: There might be far better solutions; this is just the first thing that came to mind.

Ruly-o_Oa at 2007-7-9 22:44:51
# 4

> Are you generating the probabilities as the user uses the program, or are you storing that in the file(s) as well?

I don't know :) I'm just trying to get some ideas before I start implementing. I was thinking it would be better to adjust the probabilities in real time, because the user might use one particular word more than another. Also, increased use of novel words should raise their probability relative to the existing words.
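
One way the real-time adjustment could look, as a sketch only: it uses raw usage counts as a stand-in for probability, and a novel word simply enters the vocabulary with a count of 1 the first time it is typed. The class `AdaptiveVocabulary` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: per-word usage counts that grow while the user types.
class AdaptiveVocabulary {
    private final Map<String, Integer> counts = new HashMap<>();

    // Call each time the user finishes a word; unseen words start at 1.
    void recordUse(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    int countOf(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Words starting with `prefix`, most used first (ties alphabetical).
    List<String> predict(String prefix, int limit) {
        List<String> matches = new ArrayList<>();
        for (String w : counts.keySet()) {
            if (w.startsWith(prefix)) matches.add(w);
        }
        matches.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(matches.subList(0, Math.min(limit, matches.size())));
    }
}
```

The counts could be written back to the text file when the program exits, so that the next session starts from the learned values.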

sarahsmitha at 2007-7-9 22:44:51
# 5
The data structure you most likely want is a Trie (or maybe not, didn't read the other posts very carefully).
-Kayaman-a at 2007-7-9 22:44:51
# 6
I would definitely build a tree structure. Each node is a character, and some nodes may also be flagged as terminals.
Kaj
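
A small sketch of such a tree, with one character per edge and a terminal flag on nodes that end a word. The names `CharTree`, `insert` and `completions` are invented for the example, and a TreeMap is used only so that completions come back in alphabetical order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a character tree where some nodes are flagged as terminals.
class CharTree {
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean terminal; // true if the path from the root spells a whole word
    }

    private final Node root = new Node();

    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.terminal = true;
    }

    // All stored words that start with `prefix`, in alphabetical order.
    List<String> completions(String prefix) {
        Node node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return new ArrayList<>();
        }
        List<String> out = new ArrayList<>();
        collect(node, new StringBuilder(prefix), out);
        return out;
    }

    // Depth-first walk that rebuilds each word from the path taken.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.terminal) out.add(path.toString());
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey().charValue());
            collect(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1);
        }
    }
}
```

A frequency field could be added next to the terminal flag if the predictions need to be ranked.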
kajbja at 2007-7-9 22:44:51
# 7
> I would definitely build a tree structure. Each node is a character, and some nodes may also be flagged as terminals.

A.k.a. a [url=http://en.wikipedia.org/wiki/Trie]Trie[/url]
-Kayaman-a at 2007-7-9 22:44:51
# 8

> > I would definitely build a tree structure. Each node is a character, and some nodes may also be flagged as terminals.
>
> A.k.a. a [url=http://en.wikipedia.org/wiki/Trie]Trie[/url]

That is actually not what I thought of. It looks like the Trie stores more than one character per node.

kajbja at 2007-7-9 22:44:51
# 9
Only the end nodes do; the intermediate ones are single characters (or could be syllables or whatever).
-Kayaman-a at 2007-7-9 22:44:51
# 10
> Only the end nodes, the intermediate ones are characters (or could be syllables or whatever).

Ah, ok. I only looked at the picture to the right. :)
kajbja at 2007-7-9 22:44:52