Skip to main content
CodeFlow
data structures~18 minRoute 02 5 of 9

Hash Tables

Find anything instantly without searching the whole drawer.

Prerequisites:The Queue

Story — the librarian who skips the search

#0#1#2#3#4the formula picks the drawerbucket #2

Imagine a library of a million books, each titled with a single word. You want The Selfish Gene. A normal librarian would walk the shelves until they spot it.

A hash table librarian is different. Before any book is shelved, she runs the title through a quick formula — a hash function — that returns a drawer number. Every book lives in exactly the drawer its title hashes to. To find a book, she runs the same formula on the title and walks straight to that drawer. No browsing.

Hashy says

I never search the whole library. The formula tells me which drawer to open. If two books hash to the same drawer, I keep them together on a little chain inside it — that's a collision, and it's normal.

See it — set, get, watch the bucket light up

Type a key and a value. Press Set to insert. The hash function lights up the bucket it landed in. Try a few keys until two land in the same bucket — that's a collision.

hash(key) = sum(char codes) % 5

  1. 0
    empty
  2. 1
    empty
  3. 2
    empty
  4. 3
    empty
  5. 4
    empty

Empty table — every bucket is open.

✎ Sharpen your pencil

In one sentence, explain to a friend why looking up "apple" in a hash table doesn't get slower as you add more entries.

Hashy says

Watch the bucket light up. The formula tells me exactly where to put the value, and exactly where to find it again.

Try it — predict the bucket

Exercise. Using the toy hash sum(char codes) % 5 from the animation, predict which bucket each key lands in:

  • "cat" → ?
  • "act" → ?
  • "dog" → ?
Reveal answers

cat = 99+97+116 = 312, 312 % 5 = 2. act = 97+99+116 = 312, also 2 (collision — anagrams hash the same!). dog = 100+111+103 = 314, 4. This is why real hash functions mix the order of characters in too.

Code it — five languages, one drawer

Every mainstream language ships a hash table — Python calls it dict, Java HashMap, JavaScript Map (or a plain object). The shape is identical: set a key to a value, look it up by key, ask whether a key exists.

prices = {}
prices["apple"] = 1.20
prices["bread"] = 2.50
print(prices["apple"])  # 1.20
print("milk" in prices)  # False

Build the snippet by clicking tokens from the pool, in order. Click a slot to send a token back.

PoolClick to place
SlotsClick to send back
  1. empty
  2. empty
  3. empty
  4. empty
  5. empty
  6. empty
Hint: Subscript the dict, then assign — same shape as a list, but the index is a key.

Walk through it line by line

Step 1 / 4python
1prices = {}
2prices["apple"] = 1.20
3prices["bread"] = 2.50
4print(prices["apple"])

Quiz it — make it stick

  1. Question 1

    You have a hash table with 1,000,000 entries. About how many comparisons does it take to look up a key?

  2. Predict — Question 2

    What does this print in Python?

    d = {}
    d["x"] = 1
    d["x"] = 2
    print(d["x"])
  3. True or false — Question 3

    Two different keys can never end up in the same bucket of a hash table.

No Dumb Questions

Real questions other learners asked on this page.

  • What if two keys hash to the same bucket?
    That's a collision. The table handles it — typically by chaining (every bucket is a tiny list and you walk it) or open addressing (try the next bucket until you find an empty one). With a good hash function, collisions are rare and chains stay short, so lookup stays close to constant time.
  • Why are dict keys usually strings or numbers — not lists?
    Because the key gets hashed, and the hash has to be the same every time you ask. If you used a list as a key and then mutated it, its hash would change and the table would lose track of your value. Most languages forbid mutable keys for exactly this reason.
  • Is a Python dict "ordered"?
    Since Python 3.7, yes — keys are kept in insertion order. But that's a separate guarantee bolted onto the hash table; the underlying lookup is still by hash, not by position. Don't rely on a dict for ordering in older Pythons or other languages.