Query Rewriting for Improved Recall and Precision
Query rewriting reformulates the user's original text to resolve ambiguity, normalize variations, and expand or relax terms to improve retrieval quality. A rewrite might transform "mens dress shirts" into a canonical form like "men's dress shirt" or expand it to include synonyms like "formal shirts" and "button down shirts." The goal is to bridge the vocabulary gap between how users express needs and how content is indexed.
Systems combine surface analysis with behavioral evidence. Surface methods normalize word forms using lemmatization, remove stop words like "the" and "for," split compounds, and standardize token order. Behavioral methods group queries that lead to similar engagement. Click and dwell patterns reveal that "mens dress shirts" and "dress shirts for men" are functionally equivalent because users click and purchase the same items, while "dress shirt" and "shirt dress" are not equivalent despite sharing the same tokens. Google and Amazon use co-click graphs where queries are nodes and edges represent shared clicked items. Edge weight is commonly the Jaccard similarity of the two queries' clicked-item sets, and pairs scoring above a threshold of roughly 0.6 to 0.7 are candidates for equivalence.
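The co-click idea can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the `click_log` format, the `coclick_candidates` function name, and the toy SKU identifiers are assumptions made for the example, and a real system would also weight by click counts and dwell time.

```python
from collections import defaultdict
from itertools import combinations


def coclick_candidates(click_log, threshold=0.6):
    """Return query pairs whose clicked-item sets have Jaccard >= threshold.

    click_log is a hypothetical iterable of (query, clicked_item_id) pairs.
    """
    items_by_query = defaultdict(set)
    for query, item in click_log:
        items_by_query[query].add(item)

    candidates = {}
    # Compare every pair of queries; fine for a toy log, too slow at scale.
    for q1, q2 in combinations(sorted(items_by_query), 2):
        a, b = items_by_query[q1], items_by_query[q2]
        jaccard = len(a & b) / len(a | b)
        if jaccard >= threshold:
            candidates[(q1, q2)] = jaccard
    return candidates


# Toy data: the two "dress shirts" phrasings share most clicked items,
# while "shirt dress" overlaps on only one.
click_log = [
    ("mens dress shirts", "sku1"),
    ("mens dress shirts", "sku2"),
    ("mens dress shirts", "sku3"),
    ("dress shirts for men", "sku1"),
    ("dress shirts for men", "sku2"),
    ("dress shirts for men", "sku3"),
    ("dress shirts for men", "sku4"),
    ("shirt dress", "sku1"),
    ("shirt dress", "sku9"),
]

candidates = coclick_candidates(click_log, threshold=0.6)
```

With this toy log, only the pair ("dress shirts for men", "mens dress shirts") clears the threshold, with 3 shared items out of 4 distinct ones.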
Rewrites must be faithful to user intent, compact, and fast to compute. Teams apply guardrails including a maximum token inflation factor of 1.5 to 2 times the original length, a minimum semantic similarity to the original of 0.75 to 0.85 measured by embedding cosine similarity, and closed world constraints for sensitive attributes like brand and category. Aggressive synonym expansion can dilute precision. Expanding "nike running shoes" to include "athletic footwear" might increase recall by 20 percent but decrease precision by 10 percent if users specifically want the Nike brand. Systems use confidence thresholds and A/B testing to validate that rewrites improve click-through rate (CTR) and conversion without increasing zero result rate or abandonment.
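A guardrail check along these lines might look like the sketch below. Everything named here is an assumption for illustration: the `passes_guardrails` function, its default thresholds, whitespace tokenization, and the reading of "closed world" as "the rewrite may neither drop nor introduce a protected term such as a brand." Embeddings are passed in as plain vectors so that no particular embedding model is implied.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def passes_guardrails(original, rewrite, orig_vec, rewrite_vec,
                      protected_terms=frozenset(),
                      max_inflation=2.0, min_similarity=0.75):
    """Return True only if the rewrite satisfies all three guardrails."""
    orig_tokens = original.lower().split()
    new_tokens = rewrite.lower().split()
    if not orig_tokens:
        return False

    # 1. Token inflation: the rewrite may not grow past max_inflation x original.
    if len(new_tokens) > max_inflation * len(orig_tokens):
        return False

    # 2. Semantic drift: embeddings must stay close to the original.
    if cosine_similarity(orig_vec, rewrite_vec) < min_similarity:
        return False

    # 3. Closed world (assumed interpretation): the set of protected terms
    #    must be identical before and after the rewrite.
    protected = set(protected_terms)
    if protected & set(orig_tokens) != protected & set(new_tokens):
        return False

    return True


brands = {"nike", "adidas"}
# Toy 2-d vectors stand in for real query embeddings.
ok = passes_guardrails("nike running shoes", "nike running shoes sneakers",
                       [1.0, 0.0], [0.9, 0.1], brands)
dropped_brand = passes_guardrails("nike running shoes", "athletic footwear",
                                  [1.0, 0.0], [0.9, 0.1], brands)
```

Here `ok` is True, while `dropped_brand` is False because the rewrite loses the protected term "nike" even though its toy embedding is close to the original.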
💡 Key Takeaways
•Behavioral methods use co-click graphs where queries are nodes and edges represent shared clicked items. Queries with Jaccard similarity above 0.6 to 0.7 are candidates for equivalence, revealing that "mens dress shirts" and "dress shirts for men" lead to the same purchases.
•Rewrite guardrails include a maximum token inflation factor of 1.5 to 2 times the original length, a minimum semantic similarity of 0.75 to 0.85 measured by embedding cosine similarity, and closed world constraints for brand and category.
•Aggressive synonym expansion can trade precision for recall, for example a 20 percent recall gain at the cost of a 10 percent precision drop. Expanding "nike running shoes" to "athletic footwear" dilutes brand intent, so such rewrites need confidence thresholds and A/B validation.
•Systems maintain abstention thresholds to keep original text when confidence is low, typically below 0.6 to 0.7, preserving long tail and niche intents that normalization might erase.
•Multi-query fanout generates up to 2 or 3 alternative rewrites for ambiguous queries, runs retrieval for each in parallel, and merges the results with a learned aggregation that penalizes duplicates and ensures diversity.
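The last two takeaways, abstention and multi-query fanout, can be sketched together. This is an illustration under stated assumptions: `select_queries` and `rrf_merge` are hypothetical names, rewrite confidences are assumed to come from an upstream model, and reciprocal rank fusion (RRF) is swapped in as a simple, non-learned stand-in for the learned aggregation described above. Retrieval is represented by precomputed ranked lists rather than real parallel calls.

```python
from collections import defaultdict


def select_queries(original, rewrites, min_confidence=0.6, max_fanout=3):
    """Pick which query texts to send to retrieval.

    rewrites is a list of (text, confidence) pairs. If no rewrite reaches
    min_confidence, abstain and keep the original text unchanged.
    """
    confident = [r for r in rewrites if r[1] >= min_confidence]
    if not confident:
        return [original]
    confident.sort(key=lambda r: r[1], reverse=True)
    return [text for text, _ in confident[:max_fanout]]


def rrf_merge(ranked_lists, k=60):
    """Merge ranked result lists with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so duplicates collapse into one entry and consensus is rewarded.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Low-confidence rewrite: abstain and keep the niche original query.
kept = select_queries("rare widget x9", [("widget", 0.3)])

# Two rewrites retrieved separately (toy ranked lists of document ids).
merged = rrf_merge([["a", "b", "c"], ["b", "c", "d"]])
```

In this toy run `kept` is the original query, and `merged` ranks "b" and "c" first because both lists returned them. RRF only deduplicates and rewards agreement; enforcing diversity would need an additional step that is not shown.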
📌 Examples
Amazon: the query "iphone charger" is rewritten to "Apple iPhone charging cable" with brand normalization. In a 2 week A/B test on a 10 percent traffic split, precision increases by 12 percent and zero results fall by 8 percent.
Google Search: the query "how to fix leaky faucet" is expanded to include "repair dripping tap" and "faucet leak solutions." Recall increases by 18 percent at a semantic similarity of 0.81, and click-through rate (CTR) improves from 0.22 to 0.26.
Airbnb: the query "apartmnt paris" is corrected to "apartment paris" by spell check and linked to the canonical location Paris, France. This prevents zero results and improves conversion by 9 percent for misspelled queries.