Re-arranging an array of words
Hi, I have a large array of over 5000 words. I need to take all these words and relist them by group, e.g. so that words like 'fire', 'flame', 'burn' and 'combustible' are listed next to each other.
Is there a thesaurus I can use to perform this operation? I would like to be able to use the thesaurus in SQL Server 2005. Is this possible? Also, what algorithm should I use to equally cluster similar words?
I appreciate any help or comments that are given--thanks in advance!
By MrPeanuta at 2007-10-3 1:14:08

Have you heard about WordNet? It's a lexical database for the English language. WordNet organizes words into synonym sets (synsets), which are linked hierarchically according to their lexical concept, giving a tree-like structure. http://wordnet.princeton.edu/
As for SQL Server 2005, I'm not sure... you may have to write an interface that interacts with the WordNet database.
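One possible shape for such an interface, as a rough sketch only: export (word, group id) pairs from the thesaurus into a table, then join your word list against it and sort by group. I have not tried this on SQL Server 2005; the example below uses Python's built-in sqlite3 module as a stand-in, and the table names, column names and toy rows are all made up for illustration.

```python
import sqlite3

# Hypothetical toy data: (word, group_id) pairs as they might be exported
# from a thesaurus such as WordNet. Real data would be far larger.
SYNONYM_ROWS = [
    ("fire", 1), ("flame", 1), ("burn", 1), ("combustible", 1),
    ("water", 2), ("rain", 2),
]


def words_ordered_by_group(words):
    """Return the words sorted so that words sharing a group are adjacent."""
    # In-memory SQLite is only a stand-in for a SQL Server database;
    # column types would need adjusting there (e.g. NVARCHAR instead of TEXT).
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE synonym_group (word TEXT PRIMARY KEY, group_id INTEGER)"
        )
        conn.executemany("INSERT INTO synonym_group VALUES (?, ?)", SYNONYM_ROWS)
        conn.execute("CREATE TABLE my_words (word TEXT)")
        conn.executemany("INSERT INTO my_words VALUES (?)", [(w,) for w in words])
        # Words with no known group sort last, then by group, then alphabetically.
        rows = conn.execute(
            "SELECT m.word FROM my_words AS m "
            "LEFT JOIN synonym_group AS g ON g.word = m.word "
            "ORDER BY CASE WHEN g.group_id IS NULL THEN 1 ELSE 0 END, "
            "g.group_id, m.word"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    print(words_ordered_by_group(["water", "apple", "flame", "rain", "burn"]))
```

The query itself is plain SQL (a LEFT JOIN plus an ORDER BY with a CASE expression), so the same idea should carry over, but treat the details as assumptions to check against your own setup.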
You may also want to check out Roget's Thesaurus. I believe it is available from Project Gutenberg: http://www.gutenberg.org.
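As for the grouping itself, one simple approach (not the only one) is to treat every synonym set as a link between its words and merge linked words into groups with a union-find structure. Here is a minimal sketch in Python; TOY_SYNSETS is hand-written sample data standing in for real WordNet or Roget's entries, and group_words is just a name I picked.

```python
from collections import defaultdict

# Hypothetical toy synonym sets standing in for entries from a real thesaurus.
TOY_SYNSETS = [
    {"fire", "flame", "blaze"},
    {"burn", "fire", "combustible"},
    {"water", "rain"},
]


def group_words(words, synsets):
    """Group words linked, directly or transitively, by a shared synonym set."""
    # Each word starts as the representative of its own group.
    parent = {w: w for w in words}

    def find(w):
        # Walk up to the group representative, shortening the path on the way.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for synset in synsets:
        # Only words from our own list matter; the rest of the synset is ignored.
        members = [w for w in synset if w in parent]
        for other in members[1:]:
            union(members[0], other)

    # Collect groups in order of first appearance in the input list.
    groups = defaultdict(list)
    for w in list(parent):
        groups[find(w)].append(w)
    return list(groups.values())


if __name__ == "__main__":
    my_words = ["fire", "water", "flame", "burn", "combustible", "rain", "zebra"]
    for group in group_words(my_words, TOY_SYNSETS):
        print(group)
```

Flattening the returned groups gives the relisted array. Note that merging transitively can chain loosely related words into one big group when a word has several senses, so with a real thesaurus you may want to restrict which synonym sets you use.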
hope that helps
sergey