Ranking Signals and Personalization in Autocomplete
Frequency Alone Is Not Enough
If autocomplete ranked purely by global search frequency, every user would see the same globally popular completions for a given prefix. A user typing "ca" would always get the most-searched matching terms, whether or not "california" or "calculator" is the relevant option in context. Ranking must therefore blend multiple signals: frequency, recency, click-through rate (CTR), personalization, and context.
Signal 1 Query Frequency
How often do users search for this term? A term searched 100,000 times daily ("amazon") outranks one searched 100 times daily ("amazonia"). Frequency data comes from query logs, aggregated hourly or daily. Raw counts are log-scaled: log(frequency + 1) compresses the dynamic range while preserving ordering, which keeps viral spikes from dominating the score.
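The effect of log scaling on the two example counts can be checked with a minimal sketch. The function name is hypothetical, and the natural logarithm is an assumption, since the text does not fix a base; any base preserves the ordering.

```python
import math

def log_scaled_frequency(count: int) -> float:
    """Compress a raw search count with log(count + 1).

    Assumption: natural log. The +1 keeps a count of zero at a score of 0.
    """
    return math.log(count + 1)

# Counts taken from the example above.
amazon = log_scaled_frequency(100_000)
amazonia = log_scaled_frequency(100)

# The raw counts differ by a factor of 1000; the scaled scores differ
# by a factor of less than 3, yet "amazon" still ranks first.
ratio = amazon / amazonia
```

The ordering survives, but the gap between a very popular term and a moderately popular one shrinks enough for the other signals to influence the final rank.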
Signal 2 Recency and Time Decay
A term trending in the last hour matters more than one that was popular last month. Apply exponential time decay: weight = e^(-lambda * age). With lambda = 0.01 and age in hours, a signal that is 24 hours old retains about 79 percent of its weight; one that is 7 days old retains only about 19 percent. This surfaces trending terms quickly while maintaining a baseline for stable, popular terms.
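The decay figures above can be reproduced with a short sketch. The function and parameter names are hypothetical; lambda = 0.01 per hour is the value used in the text.

```python
import math

def recency_weight(age_hours: float, lam: float = 0.01) -> float:
    """Exponential time decay: e^(-lam * age), with age measured in hours."""
    return math.exp(-lam * age_hours)

one_day = recency_weight(24)        # about 0.79
one_week = recency_weight(7 * 24)   # about 0.19
```

A larger lambda makes the ranking react faster to trends at the cost of forgetting stable terms sooner, so the decay rate is a tuning parameter rather than a fixed constant.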
Signal 3 Context
What was the previous query? If the user just searched "python", the prefix "pa" should suggest "pandas" over "pancakes". Context windows of 1 to 5 previous queries improve relevance. Maintain co-occurrence statistics per query pair, and boost terms that frequently follow the recent query.
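One way to keep the co-occurrence statistics is a nested counter keyed by the previous query. The sketch below is illustrative only: the class and method names are hypothetical, the context window is fixed at one previous query, and the boost is assumed to be the share of follow-up queries that matched the candidate.

```python
from collections import Counter, defaultdict

class CooccurrenceModel:
    """Tracks how often one query directly follows another (window of 1)."""

    def __init__(self) -> None:
        # previous query -> Counter of the queries that followed it
        self._next_counts: dict[str, Counter] = defaultdict(Counter)

    def observe(self, previous_query: str, next_query: str) -> None:
        """Record one (previous, next) query pair from the logs."""
        self._next_counts[previous_query][next_query] += 1

    def context_boost(self, previous_query: str, candidate: str) -> float:
        """Fraction of follow-ups to previous_query that were candidate."""
        counts = self._next_counts.get(previous_query)
        if not counts:
            return 0.0  # no history for this context, so no boost
        return counts[candidate] / sum(counts.values())

# Made-up log data for illustration.
model = CooccurrenceModel()
for _ in range(3):
    model.observe("python", "pandas")
model.observe("python", "pancakes")

pandas_boost = model.context_boost("python", "pandas")      # 3 of 4 follow-ups
pancakes_boost = model.context_boost("python", "pancakes")  # 1 of 4 follow-ups
```

A production system would likely smooth these ratios and bound the table size, since the number of distinct query pairs grows quickly.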
Signal 4 Personalization
A developer typing "py" wants "python"; a chef wants "pyrex". User history from the past 30 to 90 days creates a personal prior. Blend it with the global score: score = alpha * personal + (1 - alpha) * global, with alpha typically 0.2 to 0.4. Higher alpha risks filter bubbles. Weighting personal signals at 20 to 30 percent and global signals at 70 to 80 percent leaves room for discovery.
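The blend is a simple linear interpolation, sketched below. The function name and all scores are made up for illustration, and both inputs are assumed to be normalized to the range 0 to 1.

```python
def blended_score(personal: float, global_score: float, alpha: float = 0.3) -> float:
    """Linear blend: alpha * personal + (1 - alpha) * global."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * personal + (1 - alpha) * global_score

# Hypothetical global scores for the prefix "py": "pyrex" is slightly ahead.
GLOBAL = {"python": 0.6, "pyrex": 0.7}

# Hypothetical personal priors derived from each user's history.
developer = {"python": 0.9, "pyrex": 0.0}
chef = {"python": 0.0, "pyrex": 0.9}

def rank_for(personal_prior: dict[str, float]) -> list[str]:
    """Order the candidates by blended score, best first."""
    return sorted(
        GLOBAL,
        key=lambda term: blended_score(personal_prior[term], GLOBAL[term]),
        reverse=True,
    )

developer_ranking = rank_for(developer)  # "python" first
chef_ranking = rank_for(chef)            # "pyrex" first
```

With alpha at 0.3 the personal prior is enough to flip a close global ranking, but a term with a strong global lead would still win, which is the discovery behavior the text describes.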
Signal Combination
Combine the signals multiplicatively: score = freq^0.5 * recency_decay * (1 + context_boost) * (1 + personal_boost). The square root on frequency prevents popular terms from dominating. Tune the weights via A/B tests that measure CTR; good autocomplete achieves 30 to 50 percent CTR on the first suggestion.
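The combined formula can be sketched as a single scoring function. The names, the default lambda of 0.01 per hour, and the candidate data are assumptions for illustration; real weights would come from the A/B tests mentioned above.

```python
import math

def combined_score(
    frequency: float,
    age_hours: float,
    context_boost: float = 0.0,
    personal_boost: float = 0.0,
    lam: float = 0.01,
) -> float:
    """freq^0.5 * recency_decay * (1 + context_boost) * (1 + personal_boost)."""
    recency_decay = math.exp(-lam * age_hours)
    return (
        math.sqrt(frequency)
        * recency_decay
        * (1 + context_boost)
        * (1 + personal_boost)
    )

# Made-up candidates for the prefix "pa" after the user searched "python".
candidates = {
    "pancakes": combined_score(frequency=900, age_hours=0),
    "pandas": combined_score(frequency=400, age_hours=0, context_boost=0.75),
}

# Highest score first.
ranking = sorted(candidates, key=candidates.get, reverse=True)
```

Here "pancakes" has more than twice the raw frequency, but the context boost lifts "pandas" above it. Because the boosts enter as (1 + boost), a missing signal leaves the score unchanged instead of zeroing it out.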