How does HashSet work internally, and how does it ensure unique elements?

At first glance, HashSet feels simple. You add elements, and duplicates are silently ignored. But under the hood, there is a very intentional design that explains both its performance and its behavior.

HashSet does not store elements directly. Internally, it is backed by a HashMap. Every element you add to a HashSet becomes a key in that map, paired with a constant dummy value. This means all the rules that apply to HashMap keys also apply to HashSet elements.
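
To make the idea concrete, here is a minimal sketch of a set built on top of a map. The class name MapBackedSet and its PRESENT placeholder are illustrative assumptions, not the JDK source, although the real HashSet follows the same general pattern.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration only; not the JDK implementation.
class MapBackedSet<E> {
    // One shared placeholder object is stored as the value for every key.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // put returns the previous value for the key, or null if there was none,
    // so a null result means the element was newly added.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    int size() {
        return map.size();
    }
}
```

Because the elements are map keys, the set needs no duplicate-detection logic of its own. It simply inherits the key uniqueness that the map already provides.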

When you call add, two things happen. First, Java calls hashCode on the element and uses the result to decide which bucket the element belongs to. Then it compares the element against any entries already in that bucket, using equals, to determine whether an equal element already exists.
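
The bucket selection step can be sketched roughly as follows. This mirrors how OpenJDK's HashMap (Java 8 and later) derives an index from a hash code, but that is an implementation detail rather than a documented guarantee, and the class and method names here are hypothetical.

```java
// Hypothetical helper for illustration; the names are not part of any JDK API.
class BucketIndexSketch {
    // Fold the high 16 bits of the hash code into the low 16 bits so that
    // small tables still benefit from the upper bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // tableLength is assumed to be a power of two, so the mask behaves
    // like a cheap modulo operation.
    static int bucketIndex(Object key, int tableLength) {
        return spread(key) & (tableLength - 1);
    }

    public static void main(String[] args) {
        // "a".hashCode() is 97, and 97 & 15 is 1.
        System.out.println(bucketIndex("a", 16)); // prints 1
    }
}
```

Equal objects must return the same hashCode, so they always land in the same bucket. That is what allows the equals check to be limited to a single bucket instead of the whole set.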

This combination is what guarantees uniqueness. If an existing element has the same hash and equals returns true, HashSet treats the new object as a duplicate: add returns false and the set is left unchanged. If the hashes differ or equals returns false, the element is treated as unique and stored.
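
A short example shows the duplicate check from the caller's side. The UserId class below is a hypothetical value type, written only to show two distinct objects that are equal by value.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class AddDemo {
    // Hypothetical value class: equality and hash are both based on 'value'.
    static final class UserId {
        private final String value;

        UserId(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof UserId)) return false;
            return Objects.equals(value, ((UserId) o).value);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(value);
        }
    }

    // Adds two separate but equal objects and reports what add returned.
    static boolean[] addTwice(String raw) {
        Set<UserId> ids = new HashSet<>();
        boolean first = ids.add(new UserId(raw));  // new element
        boolean second = ids.add(new UserId(raw)); // equal element, rejected
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] result = addTwice("u-42");
        System.out.println(result[0] + " " + result[1]); // prints "true false"
    }
}
```

The return value of add is often the most convenient way to ask "have I seen this before?" without a separate contains call.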

This design explains a few important behaviors. HashSet does not preserve insertion order because hashing is about distribution, not sequence. It also explains why overriding hashCode and equals correctly is critical. A broken implementation can lead to duplicated data or elements that cannot be found later.
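
The following sketch shows one way a broken contract plays out. Both key classes are hypothetical: BrokenKey compares values case-insensitively but hashes them case-sensitively, while FixedKey derives equals and hashCode from the same normalized string.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class BrokenContractDemo {
    // Broken: equals ignores case, but hashCode does not.
    static final class BrokenKey {
        final String value;

        BrokenKey(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BrokenKey
                    && value.equalsIgnoreCase(((BrokenKey) o).value);
        }

        @Override
        public int hashCode() {
            return value.hashCode(); // "abc" and "ABC" hash differently
        }
    }

    // Fixed: equals and hashCode both use the same normalized form.
    static final class FixedKey {
        final String normalized;

        FixedKey(String value) {
            this.normalized = value.toLowerCase(Locale.ROOT);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof FixedKey
                    && normalized.equals(((FixedKey) o).normalized);
        }

        @Override
        public int hashCode() {
            return normalized.hashCode();
        }
    }

    static int brokenSize() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey("abc"));
        set.add(new BrokenKey("ABC")); // equal by equals, yet stored again
        return set.size();
    }

    static int fixedSize() {
        Set<FixedKey> set = new HashSet<>();
        set.add(new FixedKey("abc"));
        set.add(new FixedKey("ABC")); // recognized as a duplicate
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(brokenSize()); // prints 2
        System.out.println(fixedSize());  // prints 1
    }
}
```

With BrokenKey, the two objects produce different hashes, so equals is never consulted and the set ends up holding a logical duplicate. The same mismatch also makes lookups fail: a contains call with "ABC" would miss an element stored as "abc".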

In real systems, this matters more than it seems. Think about processed transaction IDs, unique user identifiers, or idempotency keys in financial services. HashSet gives you add and contains checks that run in constant time on average, which makes it a fast and reliable way to enforce uniqueness, but only if the objects stored respect the hashCode and equals contract.
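
As a rough illustration of the transaction ID case, the sketch below skips IDs that have already been seen. The TransactionDeduplicator class and its method names are assumptions made for this example, and it is deliberately single-threaded, since HashSet itself is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; single-threaded use is assumed.
class TransactionDeduplicator {
    private final Set<String> processedIds = new HashSet<>();

    // Returns true if the ID is new and the transaction should be processed,
    // false if it was already seen.
    boolean markProcessed(String transactionId) {
        return processedIds.add(transactionId);
    }

    // Keeps only the first occurrence of each ID, in arrival order.
    static List<String> firstOccurrences(List<String> incomingIds) {
        TransactionDeduplicator dedup = new TransactionDeduplicator();
        List<String> accepted = new ArrayList<>();
        for (String id : incomingIds) {
            if (dedup.markProcessed(id)) {
                accepted.add(id);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("tx-1", "tx-2", "tx-1", "tx-3", "tx-2");
        System.out.println(firstOccurrences(ids)); // prints [tx-1, tx-2, tx-3]
    }
}
```

String already implements hashCode and equals by value, which is why plain string keys work here without any extra code.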

HashSet looks simple on the surface, but its power comes from well-defined rules. Understanding those rules is what turns a convenient collection into a reliable tool at scale.