
Trie Node Metadata and Top K Precomputation

The Problem with Naive Traversal

You have a trie with 50 million words. The user types "ca" and you need to show the top 10 suggestions. Naive approach: traverse all descendants of the "ca" node (perhaps 2 million words), score each by frequency, sort, and return the top 10. This takes 100 to 500 ms, far too slow for keystroke-by-keystroke response; users expect suggestions within 50 to 100 ms of each keystroke. Short prefixes have enormous subtrees: prefix "c" might have 5 million descendants, "ca" 2 million, and "car" 500,000.
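The cost of the naive approach is visible in code. A minimal sketch (the `TrieNode` layout and helper names are illustrative, not a production structure): the query cannot return until it has visited every node in the prefix's subtree.

```python
import heapq

class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.word = None     # set on word-terminal nodes
        self.freq = 0

def insert(root, word, freq):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.word, node.freq = word, freq

def naive_top_k(root, prefix, k=10):
    """Walk to the prefix node, then score EVERY descendant: O(subtree size)."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    # DFS over the entire subtree -- millions of nodes for short prefixes
    heap, stack = [], [node]
    while stack:
        n = stack.pop()
        if n.word is not None:
            heapq.heappush(heap, (n.freq, n.word))
            if len(heap) > k:
                heapq.heappop(heap)  # keep only the k most frequent seen so far
        stack.extend(n.children.values())
    return [w for _, w in sorted(heap, reverse=True)]
```

The bounded heap avoids a full sort, but the DFS itself is the bottleneck: its cost scales with the subtree, not with K.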

Precomputing Top K at Each Node

The solution: at each internal node, precompute and store the top K (typically K = 10 to 20) most popular completions reachable from that node. The node for "ca" stores ["car", "cat", "california", "calculator"] directly. When the user types "ca", immediately return these precomputed results; no descendant traversal is needed. Query time drops from 200 ms to 0.1 ms, a 2000x improvement. The trade-off is preprocessing time and memory.

Building the Precomputed Lists

During index construction, after inserting all words with their frequencies, run a bottom-up aggregation pass. Each leaf node contributes its word and frequency to its parent. Each internal node merges its children's top K lists, keeps the best K by frequency, and propagates upward. For 50 million words with average depth 8 and K = 10, this requires roughly 400 million merge operations, taking 5 to 10 minutes as a one-time batch job. The merge is a K-way merge of sorted lists, easily parallelizable across subtrees.
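The bottom-up pass can be sketched as a post-order recursion. Here `heapq.nlargest` over the concatenated children's lists stands in for the K-way merge of sorted lists described above; node layout and names are illustrative assumptions:

```python
import heapq

class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.word = None    # set on word-terminal nodes
        self.freq = 0
        self.top_k = []     # filled by build_top_k: [(freq, word)], best first

def insert(root, word, freq):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.word, node.freq = word, freq

def build_top_k(node, k=10):
    """Post-order pass: children first, then merge their lists at this node."""
    candidates = []
    if node.word is not None:
        candidates.append((node.freq, node.word))
    for child in node.children.values():
        build_top_k(child, k)
        candidates.extend(child.top_k)   # each child contributes at most k entries
    node.top_k = heapq.nlargest(k, candidates)
```

Because each node only looks at its own children's (already final) lists, independent subtrees can be processed in parallel, which is what makes the 5-to-10-minute batch job practical.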

The Space Trade Off

Storing K completions at each of N nodes requires O(N × K) space. With 30 million unique prefixes (nodes) and K = 10, you store 300 million word references. If each reference is a 4-byte word ID plus a 4-byte frequency, that is 2.4 GB just for precomputed suggestions. Many systems store only the word ID (4 bytes) and fetch full strings from a separate dictionary at serve time. The trade-off: 2 to 3x memory increase for 1000x faster query time.
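The arithmetic behind the 2.4 GB figure, spelled out:

```python
nodes = 30_000_000       # unique prefixes (trie nodes)
k = 10                   # completions stored per node
bytes_per_entry = 8      # 4-byte word ID + 4-byte frequency

total_bytes = nodes * k * bytes_per_entry
print(total_bytes / 1e9, "GB")   # 2.4 GB
```

Storing compact word IDs instead of full strings is what keeps the per-entry cost at 8 bytes; inlining average-length strings would multiply the footprint several times over.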

Handling Updates and Freshness

When word frequencies change (a term becomes trending), precomputed lists become stale. Options: full recomputation (expensive but simple, run nightly), incremental updates (propagate frequency changes up the tree, complex bookkeeping with write amplification), or TTL-based refresh (recompute subtrees whose data is older than 1 hour). Most production systems run nightly full recomputation for the main dictionary and maintain a separate hot cache for trending terms updated every few minutes. At query time, merge the precomputed list from the cold trie with the hot trending cache by interleaving scores.
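One way the query-time merge might look. This is a sketch under assumptions not stated in the source: both tiers return `(score, word)` pairs, and when a word appears in both, the higher (trending) score wins before re-ranking.

```python
def merge_suggestions(cold, hot, k=10):
    """Merge cold-trie and hot-cache suggestion lists into one top-k ranking.

    cold, hot: lists of (score, word), best first.
    A word present in both tiers keeps its higher score.
    """
    scores = {}
    for score, word in cold:
        scores[word] = score
    for score, word in hot:
        scores[word] = max(score, scores.get(word, 0))  # trending score wins if higher
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:k]]
```

The key property is that a freshly trending term can outrank stale-but-popular precomputed entries without waiting for the nightly rebuild.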

💡 Key Takeaways
- Naive descendant traversal over 2 million words takes 100 to 500 ms; precomputed top K returns in 0.1 ms, a 2000x improvement
- Each node stores the top K (typically 10 to 20) most popular completions; query is O(P + K) for prefix length P
- Building top K lists: bottom-up aggregation merging children's lists; roughly 400 million merge operations in 5 to 10 minutes for 50 million words
- Space cost: 30 million nodes × K = 10 entries × 8 bytes = 2.4 GB; trade a 2 to 3x memory increase for 1000x faster queries
- Handle freshness with a two-tier architecture: nightly full rebuild for the cold trie, plus a hot cache for trending terms updated every few minutes
📌 Interview Tips
1. Walk through the precomputation: leaf nodes propagate upward, and each parent merges its children's top K lists and keeps the best K. This is like a tournament bracket.
2. Explain the freshness trade-off: a nightly rebuild is simple but stale; incremental updates are fresh but cause write amplification on ancestor nodes.