DSA Hash Sets

DSA Tutorial

DSA Hash Sets – Complete Beginner Guide

(Concept • Operations • Examples • Implementation • Complexity)

A Hash Set is a hash-based data structure used to store unique elements only.
It’s extremely useful when you only need to check existence or remove duplicates—fast.

1️⃣ What is a Hash Set?

A Hash Set:

  • Stores values only (no key–value pairs)

  • Does not allow duplicates

  • Uses a hash function internally

  • Does not maintain order

Example

Insert: 5, 10, 5, 20
Stored: 5, 10, 20

2️⃣ Why Use a Hash Set?

✔ Fast membership checks
✔ Automatic duplicate removal
✔ Clean logic for set-based problems

Best for:

  • Checking if an element exists

  • Removing duplicates

  • Union / intersection problems


3️⃣ Hash Set vs Hash Table (Quick Compare)

Feature Hash Set Hash Table
Stores Values only Key–Value pairs
Duplicates ❌ Not allowed ❌ Keys not allowed
Average Time O(1) O(1)
Use Case Uniqueness Mapping

4️⃣ How Hash Set Works Internally

Internally, a Hash Set uses a hash table:

  1. Value → hash function

  2. Hash → index (bucket)

  3. Store value in that bucket

  4. Collisions handled via chaining or probing

value → hash(value) → index → bucket

5️⃣ Basic Operations

  • add(x) → Insert if not present

  • remove(x) → Delete if present

  • contains(x) → Check existence

  • size() → Number of unique elements


6️⃣ Hash Set Example (C++ – STL)

Using unordered_set:


 

Output (order may vary):

10 20 30
Contains 20? Yes

7️⃣ Remove Duplicates Using Hash Set


 


8️⃣ Hash Set – Basic Implementation Idea (Concept)



 

📌 Real implementations also resize the table to keep operations fast.


9️⃣ Time Complexity

Operation Average Worst Case
Insert O(1) O(n)
Search O(1) O(n)
Delete O(1) O(n)

📌 Worst case happens when many elements collide.


🔟 Space Complexity

  • O(n) to store elements

  • Extra space for buckets/collision handling


1️⃣1️⃣ Load Factor (Important)

Load Factor = number_of_elements / table_size
  • High load factor → more collisions

  • Hash sets resize automatically to maintain performance


1️⃣2️⃣ Real DSA Uses

✔ Remove duplicates
✔ Check duplicates
✔ Find union / intersection
✔ Cycle detection
✔ Two-sum (existence check)


 Interview FAQs

Q1. Does Hash Set allow duplicates?
👉 No

Q2. Is Hash Set ordered?
👉 No (unordered)

Q3. Why is Hash Set fast?
👉 Hash-based indexing (O(1) average)

Q4. When not to use Hash Set?
👉 When order matters or memory is tight


Summary

✔ Hash Set stores unique values
✔ Uses hashing internally
✔ Average O(1) operations
✔ Unordered, fast, and practical
✔ Very common in interviews & coding rounds

You may also like...