Help with encapsulation and a specific case of design

Hello all. I have been playing with Java (my first real language and first OOP language) for a couple months now. Right now I am trying to write my first real application, but I want to design it right and I am smashing my head against the wall with my data structure, specifically with encapsulation.

I go into detail about my app below, but it gets long so for those who don't want to read that far, let me just put these two questions up front:

1) How do principles of encapsulation change when members are complex objects rather than primitives? If the member objects themselves have only primitive members and show good encapsulation, does it make sense to pass a reference to them? Or does good encapsulation demand that I deep-clone all the way to the bottom of my data structure and pass only cloned objects through my top level accessors? Does the analysis change when the structure gets three or four levels deep? Don't DOM structures built of walkable nodes violate basic principles of encapsulation?

2) "Encapsulation" is sometimes used to mean no public members, other times to mean no public members AND no setter methods. The reasons for the first are obvious, but why go to the extreme of the latter? More importantly, HOW do you go to the extreme of the latter? Would an "updatePrices" method that updates encapsulated member prices based on calculations, taking a single argument of, say, the time of year, be considered a "setter" method that violates the stricter vision of encapsulation?

Even help with just those two questions would be great. For the masochistic, on to my app... The present code is at

http://www.immortalcoil.org/drake/Code.zip

The most basic form of the application is statistics driven flash card software for Japanese Kanji (Chinese characters). For those who do not know, these are ideographic characters that represent concepts rather than sounds. There are a few thousand. In abstract terms, my data structure needs to represent the following.

- There are a bunch of kanji.

Each kanji is defined by:

- a single character (the kanji itself); and

- multiple readings which fall into two categories of "on" and "kun".

Each reading is defined by:

- A string of hiragana or katakana (Japanese phonetic characters); and

- Statistics that I keep to represent knowledge of that reading/kanji pair.

Ideally the structure should be extensible. Later I might want to add statistics associated with the character itself rather than individual readings, for example. Right now I am thinking of building a data structure like so:

- A Vector that holds:
    - custom KanjiEntry objects that each hold
        - a kanji in a primitive char value; and
        - two (on, kun) arrays or Vectors of custom Reading objects that hold
            - the reading in a String; and
            - statistics of some sort, probably in primitive values
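For concreteness, the outline above might be sketched roughly as follows. This is only a sketch under assumptions: KanjiEntry and Reading are the names from the post, but every field and method name (kana, addOnReading, onReadings and so on) is hypothetical, the two counters merely stand in for "statistics of some sort", and generic Lists are used in place of the Vectors and arrays mentioned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// One reading of a kanji, plus the statistics kept for that pairing.
class Reading {
    private final String kana;  // hiragana or katakana
    private int timesAsked;     // hypothetical statistic
    private int timesCorrect;   // hypothetical statistic

    Reading(String kana) {
        this.kana = kana;
    }

    String kana() {
        return kana;
    }
}

// One kanji with its two categories of readings.
class KanjiEntry {
    private final char kanji;
    private final List<Reading> onReadings = new ArrayList<Reading>();
    private final List<Reading> kunReadings = new ArrayList<Reading>();

    KanjiEntry(char kanji) {
        this.kanji = kanji;
    }

    char kanji() {
        return kanji;
    }

    void addOnReading(String kana) {
        onReadings.add(new Reading(kana));
    }

    void addKunReading(String kana) {
        kunReadings.add(new Reading(kana));
    }

    // Read-only views: callers can walk the readings but not add or remove them.
    List<Reading> onReadings() {
        return Collections.unmodifiableList(onReadings);
    }

    List<Reading> kunReadings() {
        return Collections.unmodifiableList(kunReadings);
    }
}
```

The top-level collection of KanjiEntry objects would then be a plain list of these.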

First of all, is this approach sensible in the rough outlines?

Now, I need to be able to do the obvious things... save to and load from file, generate tables and views, and edit values. The question of editing values raises the questions I identified above as (1) and (2). Say I want to pull up a reading, quiz the user on it, and update its statistics based on whether the user got it right or wrong. I could do all this through the KanjiEntry object with a setter method that takes a zillion arguments like:

theKanjiEntry.setStatistic(
    "on",    // which set of readings
    2,       // which element in that array or Vector
    "score", // which statistic
    98);     // the value

Or I could pass a clone of the Reading object out, work with that, then tell the KanjiEntry to replace the original with my modified clone.

My instincts balk at the first approach, and a little at the second. Doesn't it make more sense to work with a reference to the Reading object? Or is that bad encapsulation?
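One way to picture the "work with a reference" option: the Reading owns its statistics, and the quiz code reports what happened instead of writing numbers into it. The sketch below is an assumption-laden illustration, not the code in the zip file; recordAnswer and accuracyPercent are hypothetical names and the two counters are made-up statistics.

```java
// A Reading that owns its statistics. Callers report an outcome;
// they never write the counters directly.
class Reading {
    private final String kana;
    private int timesAsked;
    private int timesCorrect;

    Reading(String kana) {
        this.kana = kana;
    }

    String kana() {
        return kana;
    }

    // Called by the quiz code after the user answers.
    void recordAnswer(boolean correct) {
        timesAsked++;
        if (correct) {
            timesCorrect++;
        }
    }

    // Percentage of correct answers so far (integer division), 0 if never asked.
    int accuracyPercent() {
        return timesAsked == 0 ? 0 : (100 * timesCorrect) / timesAsked;
    }
}
```

With this shape, holding a reference to a Reading lets the holder record answers but not set the score to an arbitrary value such as 98.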

A second point. When running flash cards, I do not care about the subtleties of the structure, like whether a reading is an on or a kun (although this is important when browsing a table representing the entire structure). All I really care about is kanji/reading pairings. And I should be able to quickly poll the Reading objects to see which ones need quizzing the most, based on their statistics. I was thinking of making a nice neat Hashtable with the keys being the kanji characters in Strings (not the KanjiEntry objects) and the values being the Reading objects. The result would be two indices to the Reading objects... the basic structure and my ad hoc hashtable for running quizzes. Then I would just make sure that they stay in sync in terms of the higher level structure (like if a whole new KanjiEntry got added). Is this bad form, or even downright dangerous?
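A rough sketch of such a second index, under assumptions. Because a Hashtable keyed by the kanji alone can hold only one value per key, while a kanji has several readings, the sketch swaps in a flat list of kanji/reading pairings. QuizItem, QuizIndex, neediest and the accuracy-based ordering are all hypothetical. The point it tries to show is that the index stores references to the same Reading objects, so statistics cannot drift apart; only structural changes (entries added or removed) need re-indexing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Minimal Reading with made-up statistics, repeated here so the sketch stands alone.
class Reading {
    private int timesAsked;
    private int timesCorrect;

    void recordAnswer(boolean correct) {
        timesAsked++;
        if (correct) {
            timesCorrect++;
        }
    }

    int accuracyPercent() {
        return timesAsked == 0 ? 0 : (100 * timesCorrect) / timesAsked;
    }
}

// A kanji/reading pairing for drilling. It points at the same Reading
// object the main structure holds; nothing is copied.
class QuizItem {
    final char kanji;
    final Reading reading;

    QuizItem(char kanji, Reading reading) {
        this.kanji = kanji;
        this.reading = reading;
    }
}

class QuizIndex {
    private final List<QuizItem> items = new ArrayList<QuizItem>();

    void add(char kanji, Reading reading) {
        items.add(new QuizItem(kanji, reading));
    }

    // The pairing with the lowest accuracy so far, or null if the index is empty.
    QuizItem neediest() {
        if (items.isEmpty()) {
            return null;
        }
        return Collections.min(items, new Comparator<QuizItem>() {
            public int compare(QuizItem a, QuizItem b) {
                return a.reading.accuracyPercent() - b.reading.accuracyPercent();
            }
        });
    }
}
```

Scanning a few thousand pairings this way is linear per call; whether that is fast enough is something to measure rather than assume.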

Apart from good form, the other consideration bouncing around in my head is that if I get all crazy with deep cloning and filling the bottom level guys with instance methods then this puppy is going to get bloated or lag when there are several thousand kanji in memory at once.

Any help would be appreciated.

Drake

[5447 byte] By [Drake_Duna] at [2007-10-1 23:30:03]
# 1

> 1) How do principles of encapsulation change when

> members are complex objects rather than primitives?

> If the member objects themselves have only primitive

> members and show good encapsulation, does it make

> sense to pass a reference to them? Or does good

> encapsulation demand that I deep-clone all the way to

> the bottom of my data structure and pass only cloned

> objects through my top level accessors? Does the

> analysis change when the structure gets three or four

> levels deep? Don't DOM structures built of walkable

> nodes violate basic principles of encapsulation?

Huh? Encapsulation simply means: no access to private parts. I don't see why there should be a difference between primitive-type attributes and reference-type attributes.

> 2) "Encapsulation" is sometimes used to mean no

> public members, other times to mean no public members

> AND no setter methods. The reasons for the first are

> obvious, but why go to the extreme of the latter?

It's not "other times", it's always "no public members, no setters/getters". Because when simply writing a one-line setter/getter, you might as well have a public attribute and your encapsulation is nada.

> More importantly HOW do you go to the extreme of the

> latter?

Usually by better design. Move methods that use the getters inside the class that actually has the data.

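import java.util.List;

class WileysWonderfulShoppingCart {
    private List<Item> items; // has list of items

    int calculateVAT() {
        int total = 0;
        // traverse item list; VAT is a rate constant assumed to be defined elsewhere
        for (Item item : items) {
            total += item.getPrice() * (1 + VAT); // it's early morning so I don't care for decimals
        }
        // ...
        return total;
    }
}

class Item {
    private int price;

    int getPrice() {
        return this.price; // price is a field, not a method
    }
}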

to

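import java.util.List;

class WileysWonderfulShoppingCart {
    private List<Item> items; // has list of items

    int calculateVAT() {
        int total = 0;
        // traverse item list
        for (Item item : items) {
            total += item.getVAT();
        }
        // ...
        return total;
    }
}

class Item {
    private int price;
    // ...

    int getVAT() {
        // VAT is a rate constant assumed to be defined elsewhere; the cast drops the decimals
        return (int) (this.price * (1 + VAT));
    }
}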

No getters. Advantage: better encapsulation and modularisation - in some countries, certain articles have a reduced VAT. Solution one couldn't handle that, at least not without extracting even further info from the item. But since the item already has all the info it needs, why shouldn't it decide on its VAT itself?

As a basic rule of thumb:

The one who has the data is the one using it. If another class needs that data, wonder what for and consider moving that operation away from that class. Or move from pull to push: instead of A getting something from B, have B give it to A as a method call argument.

> Would an "updatePrices" method that

> updates encapsulated member prices based on

> calculations, taking a single argument of say the

> time of year be considered a "setter" method that

> violates the stricter vision of encapsulation?

It's not really a setter. Outsiders are not setting the item's price - the item is updating its own price given an argument. This is exactly how it should be, see my above point. A breach of encapsulation would be: another object gets the item price, re-calculates it using a date it knows, and sets the price again. You can see yourself that pushing the date into the item's method is much better than breaching encapsulation and getting and setting the price.
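A small sketch of that "push the date in" idea, under assumptions: the December discount rule, the cent-based int price, and the names updatePrice and totalFor are all invented for illustration.

```java
// The caller pushes the month in; the item decides what that means for its price.
class Item {
    private final int basePriceInCents;
    private int priceInCents;

    Item(int basePriceInCents) {
        this.basePriceInCents = basePriceInCents;
        this.priceInCents = basePriceInCents;
    }

    // Hypothetical rule: 10% off in December, list price in every other month.
    // Recomputing from the base price keeps repeated calls from compounding.
    void updatePrice(int month) {
        priceInCents = (month == 12) ? basePriceInCents * 90 / 100 : basePriceInCents;
    }

    // Behaviour that uses the price stays next to the price.
    int totalFor(int quantity) {
        return priceInCents * quantity;
    }
}
```

No outsider ever reads the price, recalculates it and writes it back; the pricing rule can change without touching any caller.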

CeciNEstPasUnProgrammeura at 2007-7-15 14:12:44 > top of Java-index,Other Topics,Patterns & OO Design...
# 3
Double-oops. Sorry for posting twice, and sorry for that design mishap of using VAT as a constant inside Item while at the same time propagating the idea that it can change depending on some other info. But I think you still get my idea.
CeciNEstPasUnProgrammeura at 2007-7-15 14:12:44
# 4

I think the key tends to be whether the member is a property of the object or part of the implementation of the object. If it's a property then you should have no issue handing out an unmodifiable version of it. If it's part of the implementation or makeup of the object, then you do not pass it around.
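One way to hand out an "unmodifiable version" without cloning is to expose the member through a read-only interface. The following is only a sketch under assumptions: ReadingView, markAsked and markAllAsked are hypothetical names, and a determined caller in the same package could still cast back to Reading, so this is a design signal rather than a hard guarantee.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Read-only face of a reading: nothing here can change state.
interface ReadingView {
    String kana();
    int timesAsked();
}

class Reading implements ReadingView {
    private final String kana;
    private int timesAsked;

    Reading(String kana) {
        this.kana = kana;
    }

    public String kana() {
        return kana;
    }

    public int timesAsked() {
        return timesAsked;
    }

    // Mutator, deliberately not part of ReadingView.
    void markAsked() {
        timesAsked++;
    }
}

class KanjiEntry {
    private final List<Reading> readings = new ArrayList<Reading>();

    void addReading(String kana) {
        readings.add(new Reading(kana));
    }

    // Callers get a list they cannot modify, typed so the mutators are not visible.
    List<ReadingView> readings() {
        return Collections.<ReadingView>unmodifiableList(readings);
    }

    void markAllAsked() {
        for (Reading r : readings) {
            r.markAsked();
        }
    }
}
```

The returned list is a live view, so later changes made through KanjiEntry show up in it without any copying.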

_dnoyeBa at 2007-7-15 14:12:44
# 5

> Usually by better design. Move methods that use the

> getters inside the class that actually has the data.

...

> As a basic rule of thumb:

> The one who has the data is the one using it. If

> another class needs that data, wonder what for and

> consider moving that operation away from that class.

> Or move from pull to push: instead of A getting

> something from B, have B give it to A as a method

> call argument.

Thanks for the response. I think I see what you are saying. In my case it is something like this.

Solution 1 (disfavored):

public class KanjiDrill { // a chunk of Swing GUI or something
    public void runDrill(Vector<KanjiEntry> kanjiEntries) {
        KanjiEntry currentKanjiEntry = kanjiEntries.elementAt(0); // except really I will pick one randomly
        char theKanji = currentKanjiEntry.getKanji();
        String theReading = currentKanjiEntry.getReading();
        // build and show a flashcard based on theKanji and theReading
        // use a setter to change currentKanjiEntry's data based on whether the user answers correctly
    }
}

Solution 2 (favored):

public class KanjiDrill { // a chunk of Swing GUI or something
    public void runDrill(Vector<KanjiEntry> kanjiEntries) {
        KanjiEntry currentKanjiEntry = kanjiEntries.elementAt(0); // except really I will pick one randomly
        currentKanjiEntry.buildAndShowFlashcard(); // method includes updating stats
    }
}

I can definitely see the advantages to this, but two potential reasons to think hard about it occur to me right away. First, if this process is carried out to a sufficient extreme the objects that hold my data end up sucking in all the functionality of my program and my objects stop resembling natural concepts.

In your shopping example, say you want to generate price tags for the items. The price tags can be generated with ONLY the raw price, because we do not want the VAT on them. They are simple GIF graphics that have the price printed on an irregular polygon. Should all that graphics generating code really go into the item objects, or should we just get the price out of the object with a simple getter method and then make the tags?
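One possible middle ground for the price-tag case, following the pull-to-push suggestion quoted above: the item keeps its price private and pushes it into whatever wants to draw it, so no graphics code moves into Item. This is a sketch under assumptions; PriceSink, reportPriceTo and TagPrinter are hypothetical names, the price is an int number of cents, and the "tag" is a plain string standing in for the GIF.

```java
// Anything that can receive a raw price, e.g. a price-tag generator.
interface PriceSink {
    void acceptPrice(int priceInCents);
}

class Item {
    private final int priceInCents;

    Item(int priceInCents) {
        this.priceInCents = priceInCents;
    }

    // Push: the item hands its price to the sink; no graphics code lives here.
    void reportPriceTo(PriceSink sink) {
        sink.acceptPrice(priceInCents);
    }
}

// Stand-in for the graphics code; it builds a text label instead of a GIF.
class TagPrinter implements PriceSink {
    private String label = "";

    public void acceptPrice(int priceInCents) {
        int cents = priceInCents % 100;
        label = (priceInCents / 100) + "." + (cents < 10 ? "0" : "") + cents;
    }

    String label() {
        return label;
    }
}
```

Whether this is really better than a plain getPrice() for a value that is a genuine property of the item is a judgment call; the sketch only shows that the two options are not the only ones.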

My second concern is that the more instance methods I put into my bottom level data objects the bigger they get, and I intend to have thousands of these things in memory. Is there a balance to strike at some point?

> It's not really a setter. Outsiders are not setting

> the items price - it's rather updating its own price

> given an argument. This is exactly how it should be,

> see my above point. A breach of encapsulation would

> be: another object gets the item price, re-calculates

> it using a date it knows, and sets the price again.

> You can see yourself that pushing the date into the

> item's method is much better than breaching

> encapsulation and getting and setting the price.

So the point is not "don't allow access to the members" (which after all you are still doing, albeit less directly) so much as "make sure that any functionality implicated in working with the members is handled within the object," right? Take your shopping example. Say we live in a country where there is no VAT and the app will never be used internationally. Then we would resort to a simple setter/getter scheme, right? Or is the answer that if the object really is pure data, or almost so, then it should be turned into a standard java.util collection instead of a custom class?

Thanks for the help.

Drake

Drake_Duna at 2007-7-15 14:12:44
# 6

> I think the key tends to be if the member is a

> property of the object or part of the implementation

> of the object. If its a property then you should

> have no issue handing out an unmodifiable version of

> it. If its part of the implementation or makeup of

> the object, then you do not pass it around.

Well, I am not about to pass out part of the implementation! I am basically talking about objects that are glorified primitives (small custom collections).

Pass out a reference to the object only if it is normally unmodifiable, otherwise pass out a clone? Never pass out a reference to a modifiable object and then use that reference to modify the object, period?

Drake

Drake_Duna at 2007-7-15 14:12:44