Search & Ranking SystemsSearch Autocomplete (Trie)Easy⏱️ ~2 min

What is a Trie and Why Autocomplete Uses It

Definition
A Trie (prefix tree) stores strings by sharing common prefixes. Each node represents one character; walking from root to a node spells out a prefix. This makes "find all words starting with X" extremely fast, typically completing in under 1 millisecond even for dictionaries with millions of entries.

The Filing Cabinet Analogy

Imagine organizing a phone book of 10 million entries into nested folders. The first folder is labeled "A", containing subfolders "AA", "AB", "AC" and so on. Inside "AM" you find "AMA", "AME", "AMO". To find "Amazon", you open A, then M, then A, then Z, then O, then N, touching exactly 6 folders. You never touch folders for "B" through "Z" because they are irrelevant to your prefix. This is why autocomplete feels instant: as you type each character, the system narrows possibilities by traversing exactly one edge per keystroke.

Why Tries Win for Autocomplete

When a user types "ama", the trie walks exactly 3 edges to reach the "ama" node, regardless of whether the dictionary contains 100,000 or 50 million words. From that node, all words starting with "ama" (amazon, amaze, amateur, amalgamate) are descendants reachable by continuing traversal. The lookup is O(k) where k is prefix length, typically 3 to 10 characters. With average edge traversal taking 20 to 50 nanoseconds (a pointer dereference plus cache lookup), a 6 character prefix resolves in 120 to 300 nanoseconds.

Contrast with Brute Force Scanning

Without a trie, autocomplete requires scanning the entire dictionary: does apple start with ama? No. Does amazon start with ama? Yes. Does amaze start with ama? Yes. Continue for all 50 million words. With 50 million words averaging 8 characters, that is 400 million character comparisons per query, taking 50 to 100 ms on modern hardware. With a trie, the same query takes 0.1 ms, a 500 to 1000x improvement that makes real time keystroke response possible.

What Each Node Stores

At minimum: child pointers (one per possible next character, typically 26 for lowercase English or 256 for ASCII) and a boolean flag indicating "a word ends here." The node for "a-m-a-z-o-n" has its word end flag set; the node for "a-m-a" does not (unless "ama" itself is a valid word). Production systems store additional metadata at word ending nodes: search frequency (how often users search for this term), recency weighting, category tags, and often precomputed top K completions to avoid traversing descendants at query time.

💡 Key Takeaways
Trie traversal is O(P) where P is prefix length, independent of dictionary size; 3 character prefix walks 3 pointers whether you have 1000 or 10 million words
Edge traversal takes 20 to 50 nanoseconds; a 6 character prefix resolves in 120 to 300 nanoseconds total
Brute force scanning 50 million words takes 50 to 100 ms; trie lookup takes 0.1 ms, a 500 to 1000x improvement
Each node stores child pointers (26 for lowercase English or 256 for ASCII) plus word end flag
Production nodes store metadata: search frequency, recency, category tags, and precomputed top K completions
📌 Interview Tips
1Draw the trie structure: root with 26 children, each child with 26 children. Walk through finding amazon by traversing A then M then A then Z then O then N.
2Compare complexities: O(k) for trie where k is prefix length versus O(n times m) for brute force where n is dictionary size and m is average word length.
← Back to Search Autocomplete (Trie) Overview