The HashSet class has an add(Object o) method, which is not inherited from another class. The Javadoc for that method says the following:
Adds the specified element to this set if it is not already present. More formally, adds the specified element
e
to this set if this set contains no elemente2
such that(e==null ? e2==null : e.equals(e2))
. If this set already contains the element, the call leaves the set unchanged and returnsfalse
.
In other words, if two objects are equal, then the second object will not be added and the HashSet will remain the same. However, I've discovered that this is not true if objects e
and e2
have different hashcodes, despite the fact that e.equals(e2)
. Here is a simple example:
import java.util.HashSet;
import java.util.Iterator;
import java.util.Random;
public class BadHashCodeClass {
/**
* A hashcode that will randomly return an integer, so it is unlikely to be the same
*/
@Override
public int hashCode(){
return new Random().nextInt();
}
/**
* An equal method that will always return true
*/
@Override
public boolean equals(Object o){
return true;
}
public static void main(String... args){
HashSet<BadHashCodeClass> hashSet = new HashSet<>();
BadHashCodeClass instance = new BadHashCodeClass();
System.out.println("Instance was added: " + hashSet.add(instance));
System.out.println("Instance was added: " + hashSet.add(instance));
System.out.println("Elements in hashSet: " + hashSet.size());
Iterator<BadHashCodeClass> iterator = hashSet.iterator();
BadHashCodeClass e = iterator.next();
BadHashCodeClass e2 = iterator.next();
System.out.println("Element contains e and e2 such that (e==null ? e2==null : e.equals(e2)): " + (e==null ? e2==null : e.equals(e2)));
}
The results from the main method are:
Instance was added: true
Instance was added: true
Elements in hashSet: 2
Element contains e and e2 such that (e==null ? e2==null : e.equals(e2)): true
As the example above clearly shows, HashSet was able to add two elements where e.equals(e2)
.
I'm going to assume that this is not a bug in Java and that there is in fact some perfectly rational explanation for why this is. But I can't figure out what exactly. What am I missing?
You've violated the contract of equals
/hashCode
basically:
From the hashCode()
docs:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
and from equals
:
Note that it is generally necessary to override the
hashCode
method whenever this method is overridden, so as to maintain the general contract for thehashCode
method, which states that equal objects must have equal hash codes.
HashSet
relies on equals
and hashCode
being implemented consistently - the Hash
part of the name HashSet
basically implies "This class uses hashCode
for efficiency purposes." If the two methods are not implemented consistently, all bets are off.
This shouldn't happen in real code, because you shouldn't be violating the contract in real code...
See more on this question at Stackoverflow