Natural Language Processing Using Java
Hello friends,
I'm a newcomer to Natural Language Processing.
I'd like to know which Java packages and language features can be used for Natural Language Processing.
I'd also like to know whether there is a particular algorithm I can implement in Java to achieve this.
This is what I want to do:
I need to design a system that can take queries phrased the way a person would speak and return the nearest appropriate result.
For example, if I type "Who is James Gosling?", it must return the nearest result for the query by analysing the free-text input.
It sounds similar to a search engine, but here I'm trying to fetch the answer from predefined values in a database.
The processing would go something like this: the system starts from the root, here "Who is", then searches the database for James Gosling; if there is matching data it fetches it, otherwise it fetches the nearest value.
In other words, something like developing an intelligent system capable of answering queries asked the way humans ask them.
How do I do this in Java, and with which algorithm?
Thanks in advance.
Note:
I already posted the same thing under "Java Programming", thinking that it belonged in that division; from there a buddy redirected me here.
I hope this is not considered a cross-post; it was actually a mis-post.
Regards,
Gokul
Natural Language Processing involves a simple lexical analysis that uses a
dictionary of words. The following 'sentence' is not a valid English sentence:
(*^%^%IGHJHJHJ^HG JHJGY&*&kl lO*(*( )(
... because none of the character sequences make up a valid word. On top of this
lexical analysis a simple context free grammar is used to 'unravel' a sentence,
or to show its 'depth' structure. The following 'sentence' is not a valid
English sentence:
burger do do do it shrubbery no being at it it refrigerator it's.
... because every single word exists in the English language but the depth
structure doesn't make sense grammatically.
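To make those two stages concrete, here is a minimal sketch in Java: a dictionary lookup first, then a tiny context free grammar on top of it. The class name, the nine-word dictionary and the three grammar rules are all made up for illustration; a real system needs a vastly larger dictionary and grammar.

```java
import java.util.Map;

/** Toy two-stage check: dictionary lookup, then a tiny context free grammar. */
final class TinyGrammar {
    enum Cat { DET, ADJ, NOUN, VERB }

    // Hypothetical miniature dictionary (assumption, not a real lexicon).
    static final Map<String, Cat> LEXICON = Map.of(
        "the", Cat.DET, "a", Cat.DET,
        "angry", Cat.ADJ, "small", Cat.ADJ,
        "man", Cat.NOUN, "burger", Cat.NOUN, "refrigerator", Cat.NOUN,
        "eats", Cat.VERB, "sees", Cat.VERB);

    /** Lower-cases, strips trailing punctuation and splits on whitespace. */
    static String[] tokenize(String sentence) {
        String s = sentence.trim().toLowerCase().replaceAll("[.?!]+$", "").trim();
        return s.isEmpty() ? new String[0] : s.split("\\s+");
    }

    /** Stage 1: every token must be a known word. */
    static boolean isLexicallyValid(String sentence) {
        for (String w : tokenize(sentence)) {
            if (!LEXICON.containsKey(w)) return false;
        }
        return true;
    }

    /** Stage 2: S -> NP VP, NP -> DET ADJ* NOUN, VP -> VERB [NP]. */
    static boolean isGrammatical(String sentence) {
        if (!isLexicallyValid(sentence)) return false;
        String[] words = tokenize(sentence);
        int afterNp = parseNp(words, 0);
        if (afterNp < 0) return false;
        return parseVp(words, afterNp) == words.length;
    }

    // Each parse method returns the index just past the phrase, or -1 on failure.
    static int parseNp(String[] w, int i) {
        if (!is(w, i, Cat.DET)) return -1;
        i++;
        while (is(w, i, Cat.ADJ)) i++;
        return is(w, i, Cat.NOUN) ? i + 1 : -1;
    }

    static int parseVp(String[] w, int i) {
        if (!is(w, i, Cat.VERB)) return -1;
        i++;
        if (i == w.length) return i;   // verb without an object
        return parseNp(w, i);          // otherwise an object NP must follow
    }

    static boolean is(String[] w, int i, Cat c) {
        return i < w.length && LEXICON.get(w[i]) == c;
    }

    public static void main(String[] args) {
        System.out.println(isLexicallyValid("(*^%^% kl lO*(*("));           // unknown words
        System.out.println(isGrammatical("burger eats the the man"));       // known words, bad structure
        System.out.println(isGrammatical("The angry man eats a burger."));  // passes both stages
    }
}
```

Note that the second test sentence passes the dictionary stage and only fails in the grammar stage, which is exactly the distinction made above.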
On top of this simple dictionary based lexical analysis, a number-agreement and
other conjugation and inflection analysis must be performed. This analysis can
be implemented using a more elaborate dictionary during the parsing phase.
The following 'sentence' is not a syntactically correct English sentence:
the refrigerators eaten one men
... because there is no number agreement (singular/plural) between the noun
phrase and the verb. The verb is also in the wrong conjugation.
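As a rough sketch of what such a 'more elaborate dictionary' could look like in Java: each word form carries a number attribute, and the parser compares those attributes. The class, the method names and the handful of entries are assumptions for illustration only; real entries would carry many more attributes (person, tense, transitivity and so on).

```java
import java.util.Map;

/** Toy number-agreement check driven by dictionary attributes. */
final class AgreementCheck {
    enum Num { SINGULAR, PLURAL }

    // Hypothetical dictionary entries: word form -> grammatical number.
    static final Map<String, Num> DETERMINERS = Map.of(
        "a", Num.SINGULAR, "one", Num.SINGULAR,
        "two", Num.PLURAL, "many", Num.PLURAL);

    static final Map<String, Num> NOUNS = Map.of(
        "man", Num.SINGULAR, "men", Num.PLURAL,
        "refrigerator", Num.SINGULAR, "refrigerators", Num.PLURAL);

    // Only finite present-tense forms; a participle such as "eaten" is absent.
    static final Map<String, Num> VERBS = Map.of(
        "eats", Num.SINGULAR, "eat", Num.PLURAL);

    // Unknown words yield null and therefore never agree.
    static boolean sameNumber(Num a, Num b) {
        return a != null && a == b;
    }

    /** Determiner and noun must share the same number, e.g. "one man". */
    static boolean nounPhraseAgrees(String det, String noun) {
        return sameNumber(DETERMINERS.get(det), NOUNS.get(noun));
    }

    /** Subject noun and finite verb must share the same number. */
    static boolean subjectVerbAgrees(String noun, String verb) {
        return sameNumber(NOUNS.get(noun), VERBS.get(verb));
    }

    public static void main(String[] args) {
        System.out.println(nounPhraseAgrees("one", "men"));              // false
        System.out.println(subjectVerbAgrees("refrigerators", "eaten")); // false
        System.out.println(subjectVerbAgrees("refrigerators", "eat"));   // true
    }
}
```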
On top of this syntactical-plus analysis, if you want to be able to handle
contractions, e.g. 'wanna' for 'want to', you have to augment your language
grammar. For example, the following shows the existence of so-called 'epsilon words':
1*) Who do you wanna wash the car? I want Bill to wash the car.
2) Who do you want to wash the car? I want Bill to wash the car.
The incorrectness of sentence 1*) can only be shown if you're able to substitute
the answer 'Bill' for the empty word <eps>:
Who do you want <eps> to wash the car?
The previous examples all assumed quite a large dictionary, storing information
about nouns, verbs, adjectives etc. and attributes of them: transitive, reflexive,
inherently-singular/plural etc. etc.
As if things weren't complicated enough already, semantics, i.e. the meaning of
words and sentences, comes into play. The following sentence is a perfect English
sentence as defined above:
A small large angry refrigerator dances on my father's womb before he was born.
For several reasons (to us humans), the above sentence doesn't make sense; in other
words, the semantics of it are wrong.
For all the (negative) reasons mentioned above, I'd strongly suggest that you restrict
the natural language understanding of your system-to-be to a highly coherent, quite
limited domain.
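For the "Who is ...?" queries from the original question, such a limited domain could be sketched in Java roughly as follows. This is only one possible approach, not a full NLP solution: a regular expression recognises the single supported question form, an in-memory map stands in for the database, and the 'nearest value' is chosen by Levenshtein edit distance. The class name, the sample facts and the threshold of 3 edits are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy limited-domain answerer for questions of the form "Who is <name>?". */
final class WhoIsBot {
    // Stand-in for the database table (sample data); keys are lower-cased names.
    static final Map<String, String> FACTS = new LinkedHashMap<>();
    static {
        FACTS.put("james gosling", "James Gosling is the creator of the Java programming language.");
        FACTS.put("daniel sleator", "Daniel Sleator is a co-author of the link grammar parser.");
    }

    // The only question form this sketch understands.
    static final Pattern WHO_IS =
        Pattern.compile("^\\s*who\\s+is\\s+(.+?)\\s*\\??\\s*$", Pattern.CASE_INSENSITIVE);

    static Optional<String> answer(String query) {
        Matcher m = WHO_IS.matcher(query);
        if (!m.matches()) return Optional.empty();   // outside the supported domain
        String name = m.group(1).toLowerCase();
        String best = null;
        int bestDist = Integer.MAX_VALUE;
        for (String key : FACTS.keySet()) {
            int d = editDistance(name, key);
            if (d < bestDist) { bestDist = d; best = key; }
        }
        // Accept only close matches; the limit of 3 edits is an arbitrary choice.
        return (best != null && bestDist <= 3) ? Optional.of(FACTS.get(best)) : Optional.empty();
    }

    /** Levenshtein distance using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] t = prev; prev = cur; cur = t;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(answer("Who is James Gosling?").orElse("No idea."));
        System.out.println(answer("who is james goslin").orElse("No idea."));
        System.out.println(answer("What is Java?").orElse("No idea."));
    }
}
```

With a real database you would replace the map by a query, and with thousands of names a linear scan over all keys becomes slow, so some index or pre-filter would be needed.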
For a nice natural language parser, have a look at Daniel Sleator's [url=http://www.link.cs.cmu.edu/link/]link parser[/url].
kind regards,
Jos
JosAHa at 2007-7-16 13:03:04 >
