Dense Retrieval Failure Modes and Mitigation Strategies
Dense retrieval systems fail in predictable ways that require specific mitigation strategies. Understanding these failure modes is critical for maintaining production quality.
Distribution shift causes silent quality degradation. If your model trained on general web text but production serves medical queries, embeddings cluster poorly. Queries about "myocardial infarction" may land far from documents about "heart attacks" if medical vocabulary was underrepresented in training. You observe this as recall@100 dropping from 85% on general queries to 60% on domain queries. The failure is insidious because the system still returns results, just not relevant ones. Mitigation requires domain-specific fine-tuning with in-domain hard negatives. Many teams maintain per-domain encoders for high-value verticals, trading operational complexity for quality.
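A minimal sketch of what domain-specific fine-tuning with hard negatives can look like, here using the sentence-transformers library; the starting model, the example triplet, and the hyperparameters are illustrative assumptions, not recommendations:

```python
# Sketch: fine-tune a bi-encoder on in-domain (query, positive, hard negative)
# triplets. Hard negatives are typically mined from BM25 hits or click logs
# that raters judged irrelevant.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # general-purpose starting point

train_examples = [
    InputExample(texts=[
        "myocardial infarction treatment guidelines",            # query
        "Acute MI management includes reperfusion therapy ...",  # relevant passage
        "Heart rate zones for endurance training ...",           # hard negative
    ]),
    # ... thousands more triplets mined from in-domain logs
]

train_loader = DataLoader(train_examples, shuffle=True, batch_size=32)
# MultipleNegativesRankingLoss also treats other in-batch positives as
# negatives, so explicit hard negatives and in-batch negatives combine.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_loader, train_loss)],
    epochs=1,
    warmup_steps=100,
)
model.save("medical-bi-encoder-v2")  # hypothetical output path
```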
Index staleness creates inconsistency when models and vectors drift apart. If you retrain the query encoder but only re-embed new documents, old and new documents live in different embedding spaces. Similar concepts encoded by different model versions can be far apart geometrically. This manifests as quality declining over weeks in a way that is hard to attribute. The fix is strict model versioning: tag vectors with the encoder version, isolate indices per version, and enforce that query and document encoders match. Full re-embedding on model updates is expensive but necessary. Some systems use rolling re-embedding, updating 10% of the index daily to amortize the cost.
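One way to enforce that matching is a version gate at query time. The sketch below assumes a hypothetical wrapper around your ANN store; the field and method names are illustrative:

```python
# Sketch: refuse to serve a query if the query encoder and the index were
# built with different model versions, since mixed embedding spaces cause
# silent quality loss rather than hard errors.
from dataclasses import dataclass

@dataclass
class VersionedIndex:
    encoder_version: str   # version of the model that embedded the documents
    ann_index: object      # e.g., an HNSW or IVF index handle

def search(query: str, query_encoder, index: VersionedIndex, top_k: int = 100):
    if query_encoder.version != index.encoder_version:
        raise RuntimeError(
            f"Encoder mismatch: query encoder {query_encoder.version} "
            f"vs index {index.encoder_version}; re-embed before serving."
        )
    q_vec = query_encoder.encode(query)
    return index.ann_index.search(q_vec, top_k)
```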
Boundary errors from chunking cause missed retrieval when concepts span chunks. If a document discusses "electric vehicle battery degradation" but "electric vehicle" appears in chunk A and "battery degradation" in chunk B, neither chunk may rank highly for the full query. An overlap of 32 to 64 tokens mitigates this but increases index size by 10 to 20%. Very long documents suffer more. An alternative is hierarchical retrieval: first retrieve at the document level with aggregated embeddings, then retrieve specific chunks within the top documents.
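The overlap idea is simple enough to show directly. This sketch uses naive whitespace splitting for illustration; a real pipeline would chunk with the encoder's own tokenizer, and the 256/48 sizes are assumptions:

```python
# Sketch: fixed-size chunking with overlap, so a concept that straddles a
# boundary still appears intact in at least one chunk.
def chunk_tokens(tokens, chunk_size=256, overlap=48):
    step = chunk_size - overlap
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield tokens[start:start + chunk_size]

doc = "Electric vehicle battery degradation accelerates under fast charging ..."
chunks = list(chunk_tokens(doc.split(), chunk_size=256, overlap=48))
```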
Adversarial content and spam can poison the embedding space. If a spammer inserts thousands of near-duplicate pages optimized to match common queries, these create dense clusters that drown out legitimate results. The embedding space has finite capacity, and concentrated spam shifts its geometry. Per-publisher rate limiting and aggressive deduplication in the ingestion pipeline help. Some systems use anomaly detection on embedding distances: if 100 documents from one source all have pairwise similarity above 0.95, flag the source for review.
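That heuristic can be expressed in a few lines. The 0.95 threshold comes from the text above; the 100-document minimum and the 90% pair fraction are illustrative assumptions:

```python
# Sketch: flag a publisher whose documents form an unusually tight embedding
# cluster, a common signature of near-duplicate spam.
import numpy as np

def is_suspicious_cluster(embeddings: np.ndarray, sim_threshold: float = 0.95,
                          min_docs: int = 100) -> bool:
    """embeddings: (n_docs, dim) L2-normalized vectors from one publisher."""
    if len(embeddings) < min_docs:
        return False
    sims = embeddings @ embeddings.T            # pairwise cosine similarities
    iu = np.triu_indices(len(embeddings), k=1)  # upper triangle, no diagonal
    # Flag if nearly all document pairs are near-duplicates of each other.
    return float(np.mean(sims[iu] > sim_threshold)) > 0.9
```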
Tail latency and cold query spikes damage user experience. ANN methods like HNSW use heuristics that can fail on certain query patterns, causing 10x latency spikes when the graph traversal explores many nodes. Under memory pressure, operating systems evict index structures, forcing cold queries to page from disk. This creates bimodal latency: P50 at 10 milliseconds but P99 at 200 milliseconds. Mitigation includes aggressive P99 timeouts with fallback to cached popular results, memory-locking critical index pages, and load shedding when latency budgets are exhausted. Netflix and other high-QPS services implement partial result serving: return results from the shards that respond within budget, which beats timing out completely.
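A sketch of partial result serving with a per-shard latency budget, using asyncio; the shard interface and the 80 ms budget are assumptions for illustration, not a description of any particular production system:

```python
# Sketch: fan out to all shards, keep whatever returns within the budget,
# cancel the rest, and merge the partial results.
import asyncio
import heapq

async def search_with_budget(shards, query_vec, top_k=10, budget_s=0.08):
    tasks = [asyncio.create_task(s.search(query_vec, top_k))  # assumed async shard API
             for s in shards]
    done, pending = await asyncio.wait(tasks, timeout=budget_s)
    for t in pending:          # shards over budget are dropped, not awaited
        t.cancel()
    hits = []
    for t in done:
        if t.exception() is None:
            hits.extend(t.result())   # assumed (score, doc_id) tuples
    # Serving a merged partial result beats failing the whole query.
    return heapq.nlargest(top_k, hits, key=lambda h: h[0])
```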
💡 Key Takeaways
•Distribution shift from training on general text but serving domain queries can drop recall@100 from 85% to 60%, requiring domain-specific fine-tuning with in-domain negatives
•Index staleness when the query encoder is retrained but documents are not re-embedded causes similar concepts to drift apart, fixable only with strict model versioning and full re-embedding
•Chunking boundaries cause misses when concepts span chunks, mitigated by a 32 to 64 token overlap at the cost of a 10 to 20% index size increase
•Adversarial spam creating dense embedding clusters poisons retrieval geometry, requiring per-publisher rate limiting and deduplication with pairwise similarity thresholds around 0.95
•Tail latency spikes from ANN heuristic failures or memory pressure create P99 at 200 milliseconds versus P50 at 10 milliseconds, requiring P99 timeouts and partial result serving
📌 Examples
A medical search system observed 60% recall on clinical queries versus 85% on general queries when using web-trained BERT; fine-tuning on a PubMed corpus with 50K medical query-document pairs fixed it
An e-commerce index degraded over 6 weeks after a query encoder retrain; investigation found 40% of vectors encoded with the old model and 60% with the new, and full re-embedding restored quality
Netflix implements an 80-millisecond timeout per shard during ANN search and returns merged results from the shards that respond in time rather than failing the entire query, keeping the P99 user experience acceptable