nested object equality - design pattern
Looking to solve a problem in my own code, I wanted to see if and how the problem is solved in the java library.
I would have liked this code to output "true", to see how it's done.
Set set1 =new HashSet();
set1.add("a value");
set1.add(set1);
Set set2 =new HashSet();
set2.add("a value");
set2.add(set2);
System.out.println(set1.equals(set2));
Well the code ends with a StackOverflowError on hashCode(). Just using a Set which implements hashCode to return a constant value would shift the problem to the equals-method.
I think one possible solution would be to implement the hashCode method to set a instance variable (computingHash = true) while hashcodes of fields are computed. If the variable is set when hashcode is invoke a RecursiveHashRuntimeException is thrown which is is caught by the invoking hashcode method which would then return a constant value or ignore the corresponding field for hash-computation.
Similarly the equals method would add the obj (passed to equals) to a set (instance variable) named assumeEqualTo, if the obj is in assumeEqualTo when the method is invoked it returns true. The value is removed from assumeEqualTo before the method that added it returns.
This approach would require the equals and hashcode methods to be - at least partially - synchronized (if multithreading is an issue), an alternative would be to use ThreadLocal variables to detect and handle recursive invocations.
I'm not sure how the two approaches compare in terms of performance, and I would welcome any other approach to solve the problem. Note that the class being compared should not be required to know details about the contained classes and the nesting may also be indirect as in:
Set set1 =new HashSet();
Set set2 =new HashSet();
set1.add("a value");
set1.add(set2);
set2.add("a value");
set2.add(set1);
System.out.println(set1.equals(set2));
Also it would be nice if the impact on performance could be kept minimal for all instance that happens not to be self-containg.
cheers,
reto
Why are you adding a Set to itself? That's probably the root cause of the stack overflow. - Saish
Saisha at 2007-7-13 10:10:21 >

Of course that's the cause, but I still think the two sets are equal (even after looking the definition of Set.equals) and I'm looking for an algorithm that can determine this. OK, this doesn't answer your question why I'm doing it....
You would have to override equals and do the following:
> Iterate over all the elements in the collections, calling equals() on each. If any returns false, break and return false for your equals().
> While iterating, store in another collection all the items you have already processed. Perform an object equality test (==) on the item you are inspecting versus what you have already processed to catch the stack overflow issue. When you encounter this, you can either return true or skip to the next element and continue processing.
- Saish
Saisha at 2007-7-13 10:10:21 >

> You would have to override equals and do the
> following:
>
> > Iterate over all the elements in the collections,
> calling equals() on each. If any returns false,
> break and return false for your equals().
should be slightly different as for sets the order is irrelevant
>
> > While iterating, store in another collection all
> the items you have already processed. Perform an
> object equality test (==) on the item you are
> inspecting versus what you have already processed to
> catch the stack overflow issue. When you encounter
> this, you can either return true or skip to the next
> element and continue processing.
sounds similar to what I proposed, but doesn't answer the question if this collection of items that are already processed (I assume you mean by a direct or indiret invoker) is best passed using ThreadLocal or using instance variables and synchronizing the relevant part of th equals method.
reto
> > You would have to override equals and do the
> > following:
> >
> > > Iterate over all the elements in the
> collections,
> > calling equals() on each. If any returns false,
> > break and return false for your equals().
>
> should be slightly different as for sets the order is
> irrelevant
>
I don't follow. You would presumably only return true as a whole if every element in the Set also returned true. So, yes, order is irrelevant.
> >
> > > While iterating, store in another collection all
> > the items you have already processed. Perform an
> > object equality test (==) on the item you are
> > inspecting versus what you have already processed
> to
> > catch the stack overflow issue. When you
> encounter
> > this, you can either return true or skip to the
> next
> > element and continue processing.
> sounds similar to what I proposed, but doesn't answer
> the question if this collection of items that are
> already processed (I assume you mean by a direct or
> indiret invoker) is best passed using ThreadLocal or
> using instance variables and synchronizing the
> relevant part of th equals method.
>
> reto
No, I mean that if you add a Set to itself, and then iterate over it, you will implicitly be performing recursion, leading to a StackOverflowError eventually. That is why you need to store a collection (or array) with all the objects already analyzed. You need to compare what is being currently being inspected along with checking what you already processed for object equality (==) so you do not get the stack overflow.
- Saish
Saisha at 2007-7-13 10:10:21 >

> > should be slightly different as for sets the order
> is
> > irrelevant
> >
>
> I don't follow. You would presumably only return
> true as a whole if every element in the Set also
> returned true. So, yes, order is irrelevant.
just wanted to say that you must return true iff you find a mapping from set1 to set2 where all elements are equals
> No, I mean that if you add a Set to itself, and then
> iterate over it, you will implicitly be performing
> recursion, leading to a StackOverflowError
> eventually. That is why you need to store a
> collection (or array) with all the objects already
> analyzed.
What do you mean by "already analyzed"? I mean how would you prevent recursion with this?
> You need to compare what is being
> currently being inspected along with checking what
> you already processed for object equality (==) so you
> do not get the stack overflow.
I don't get it, in
Set set1 = new HashSet();
Set set2 = new HashSet();
set1.add(set2);
set2.add(set1);
set1.equals(set2);
The equals method returns true iff set2.equals(set1), which would - in the algorithm I proposed - return set1.add(set2) which is true as set2 is contained in the Set assumeEqualTo. I interpreted the contract specified by java.util.Set to return true on equals for indistinguishable object - not sure if this is correct for mutually referencing sets, but pretty convinced for non-refrencing self containing sets, as in my first example or for all sets returned by:
createSet() {
Set s1 = new HashSet();
Set s2 = new HashSet();
s2.add(s1);
s1.add(s2);
}
Imho, it would break be against the specification in java.util.Set to return false on createSet().equals(createSet()).
reto
> What do you mean by "already analyzed"? I mean how would you prevent recursion with this?
This is the crux of your problem. You need to maintain a List or Stack or whatever of previously evaluated instances of your custom Set. When you encounter an instance you have already processed, just skip over it. That's how you avoid the recursion trap. Set on its own will not do this. You need to extend an existing implementation or implement your own custom Set.
- Saish
Saisha at 2007-7-13 10:10:21 >
