Software Engineering/General Design question

I'm currently working on a system to access information from a morpholinguistic database (of any language) and use this information to inflect a single word (chosen by the user) in different forms. The user chooses a single word, the word is inflected according to the morphological rules, and the word with all of its endings is returned to a screen for the user. The user then uses a set of JavaScript menus to select a form, and that form, along with any possible alternative forms of the word, is displayed (with notes on when each alternative form is to be used). Note that I'm trying to design the program in such a way that the presentation layer can be anything, so the JavaScript thing was just an example.

To allow a greater degree of abstraction, the processing program must be able to return for rendering (note: I'm using the code tag here so that the formatting is kept):

* A set of possible categories for the word (e.g. noun, adjective, etc.) for words that can be used in different ways (e.g. the English word "hand", which can be used as a verb or as a noun). Each category will be defined centrally, specifying how many and what types of forms the user can choose from. By way of example, the VERB category in English would have definitions of "tense" ("present", "simple past", ...), "person" ("1st", "2nd", "3rd") and "number" ("singular", "plural").

* General notes about the word/additional information about the word, such as meaning, etc.

* A mapping from an arbitrary number of key/value pairs (identifying a form) to a set of maps of key/value pairs (the words themselves). For example, here might be the set of mappings for the English word "hand":

- (category="noun" number="singular") -> ([word="hand"])

- (category="noun" number="plural") -> ([word="hands"])

- (category="verb" tense="present" person="1st" number="singular") -> ([word="hand" example="I hand"])

- (category="verb" tense="simple past" person="3rd" number="plural") -> ([word="handed" example="They handed"])

- ...

The set exists for possible alternatives of a word. This isn't so much a problem in English, but in Russian (the first language this system is being applied to), the map of key/value pairs that identify the form (the category, tense, etc. stuff) would actually map to a SET of other key/value pairs for each possible alternative.


My question is, how would I represent all this?

Here's one rather ugly idea I had. I could have a Word object that contains some information (possibly a string) that tells what category it is, and the category itself will be defined in an XML file/somewhere else on the presentation layer. Then it will have a string with general notes, and then something like this (there would obviously be wrapper classes around these things, I'm not actually going to expose it, but this is the general hierarchical idea): Map<Map<String, String>, Set<Map<String, String>>>
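To make the nested-map idea concrete, here is a minimal sketch of what such a wrapper might look like. The names InflectionTable, features, add and lookup are made up for illustration and are not from any existing library; the sketch only assumes that a form is identified by a map of features and that each alternative is itself a small map of attributes.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical wrapper that hides the nested Map/Set structure from callers. */
class InflectionTable {
    // Key: the features that identify a form (category, tense, number, ...).
    // Value: every alternative for that form, each with its own attributes
    // (word, example, usage note, ...).
    private final Map<Map<String, String>, Set<Map<String, String>>> forms =
            new LinkedHashMap<Map<String, String>, Set<Map<String, String>>>();

    /** Builds an unmodifiable map from alternating keys and values. */
    static Map<String, String> features(String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected key/value pairs");
        }
        Map<String, String> map = new LinkedHashMap<String, String>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return Collections.unmodifiableMap(map);
    }

    /** Registers one alternative for the given form. */
    void add(Map<String, String> form, Map<String, String> alternative) {
        Set<Map<String, String>> alternatives = forms.get(form);
        if (alternatives == null) {
            alternatives = new LinkedHashSet<Map<String, String>>();
            forms.put(form, alternatives);
        }
        alternatives.add(alternative);
    }

    /** Returns all alternatives for the form, or an empty set if none are known. */
    Set<Map<String, String>> lookup(Map<String, String> form) {
        Set<Map<String, String>> alternatives = forms.get(form);
        if (alternatives == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(alternatives);
    }
}

class InflectionTableDemo {
    public static void main(String[] args) {
        InflectionTable hand = new InflectionTable();
        hand.add(InflectionTable.features("category", "noun", "number", "singular"),
                 InflectionTable.features("word", "hand"));
        hand.add(InflectionTable.features("category", "noun", "number", "plural"),
                 InflectionTable.features("word", "hands"));
        // Map equality ignores insertion order, so the features may be given in any order.
        System.out.println(
                hand.lookup(InflectionTable.features("number", "plural", "category", "noun")));
    }
}
```

Because java.util.Map defines equals and hashCode over its entries, a feature map can serve directly as a key, provided it is never modified after being inserted (hence the unmodifiable wrapper).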

However, that solution seems somehow ungainly to me. Does anyone have any other ideas about how I could implement a system like this?

Thanks!

[3656 byte] By [fraser_of_the_nighta] at [2007-11-26 18:37:41]
# 1

If I understand correctly:

I guess my first thought is to create a Word object. This object would contain attributes about a Word and also contain a Map or Set of "WordType" objects. (If it was a Map, the key would be the WordType and the value a Set of alternative words.)

WordType would contain attributes specifying the category, tense, etc. and would contain a Set of examples. (Maybe even make WordType an interface and have a Noun implementation, Verb implementation, etc.)
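A rough sketch of that layout, assuming the Map variant. The field and method names (lemma, notes, addForm, formsFor) are only illustrative guesses, and the examples Set is left out to keep it short:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Identifies one grammatical form; implementations must define equals/hashCode to work as Map keys. */
interface WordType {
    String category();
}

final class Noun implements WordType {
    private final String number;

    Noun(String number) { this.number = number; }

    public String category() { return "noun"; }

    @Override public boolean equals(Object o) {
        return o instanceof Noun && ((Noun) o).number.equals(number);
    }

    @Override public int hashCode() { return Objects.hash("noun", number); }
}

final class Verb implements WordType {
    private final String tense;
    private final String person;
    private final String number;

    Verb(String tense, String person, String number) {
        this.tense = tense;
        this.person = person;
        this.number = number;
    }

    public String category() { return "verb"; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Verb)) return false;
        Verb v = (Verb) o;
        return v.tense.equals(tense) && v.person.equals(person) && v.number.equals(number);
    }

    @Override public int hashCode() { return Objects.hash("verb", tense, person, number); }
}

/** A word with general notes and, per WordType, the set of alternative inflected forms. */
final class Word {
    private final String lemma;
    private final String notes;
    private final Map<WordType, Set<String>> forms = new HashMap<WordType, Set<String>>();

    Word(String lemma, String notes) {
        this.lemma = lemma;
        this.notes = notes;
    }

    String lemma() { return lemma; }

    String notes() { return notes; }

    void addForm(WordType type, String form) {
        Set<String> alternatives = forms.get(type);
        if (alternatives == null) {
            alternatives = new LinkedHashSet<String>();
            forms.put(type, alternatives);
        }
        alternatives.add(form);
    }

    Set<String> formsFor(WordType type) {
        Set<String> alternatives = forms.get(type);
        if (alternatives == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(alternatives);
    }
}
```

For instance, the English verb "dream" has two accepted simple past forms, so adding both "dreamed" and "dreamt" under the same Verb key would give a Set of two alternatives.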

zadoka at 2007-7-9 6:11:46
# 2
More abstraction is better. A String is too high level. If you want behavior associated with words, by all means create a Word class. Think more abstractly. That's what objects are for. Hide those details in the class.
duffymoa at 2007-7-9 6:11:46
# 3

Go for some kind of Word class as said before. I don't think you should start with hard coding things into maps for everything. Create a concept of grammar. Words are inflected by applying rules to them. Implement those rules for regular verbs and nouns. Let your program generate the result. This allows for a more flexible way of handling words. Of course you'll have to identify the exceptions to the general rule and come up with a solution for it.
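Something along these lines, as a sketch only. It uses English noun plurals because they are easy to check; the InflectionRule and PluralInflector names are invented for the example, and the rules are deliberately incomplete (real grammars, Russian especially, need far more rules and exceptions than this):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One grammar rule: decides whether it applies to a lemma and, if so, produces the inflected form. */
interface InflectionRule {
    boolean appliesTo(String lemma);
    String apply(String lemma);
}

/** Hypothetical inflector: looks up exceptions first, then tries the rules in order. */
final class PluralInflector {
    private final Map<String, String> exceptions = new HashMap<String, String>();
    private final List<InflectionRule> rules = new ArrayList<InflectionRule>();

    PluralInflector() {
        // Irregular forms that no general rule produces.
        exceptions.put("child", "children");
        exceptions.put("man", "men");

        // Sibilant endings take "es": "box" -> "boxes".
        rules.add(new InflectionRule() {
            public boolean appliesTo(String lemma) {
                return lemma.matches(".*(s|x|ch|sh)");
            }
            public String apply(String lemma) {
                return lemma + "es";
            }
        });

        // Consonant followed by "y": "city" -> "cities".
        rules.add(new InflectionRule() {
            public boolean appliesTo(String lemma) {
                return lemma.matches(".*[^aeiou]y");
            }
            public String apply(String lemma) {
                return lemma.substring(0, lemma.length() - 1) + "ies";
            }
        });

        // Default rule, always applies: "hand" -> "hands".
        rules.add(new InflectionRule() {
            public boolean appliesTo(String lemma) {
                return true;
            }
            public String apply(String lemma) {
                return lemma + "s";
            }
        });
    }

    String pluralOf(String lemma) {
        String irregular = exceptions.get(lemma);
        if (irregular != null) {
            return irregular;
        }
        for (InflectionRule rule : rules) {
            if (rule.appliesTo(lemma)) {
                return rule.apply(lemma);
            }
        }
        throw new IllegalStateException("no rule for " + lemma);
    }
}
```

The point of the design is that only the exceptions need to be stored as data; everything regular is generated, and adding a language means adding rule implementations rather than filling in more maps by hand.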

Peetzorea at 2007-7-9 6:11:46
# 4
> More abstraction is better. A String is too high level. If you want behavior associated with words, by all means create a Word class.

Should have read "A String is too low level".
duffymoa at 2007-7-9 6:11:46