Failure Modes and Edge Cases in Production Semantic Search
Embedding Drift
When you update your embedding model, all existing vectors become incompatible. The new model produces vectors in a different semantic space - even identical text gets different coordinates. Documents embedded with model v1 will not match queries embedded with v2. The vectors speak different languages.
The only real fix is to re-embed your entire corpus when changing models. For millions of documents, this takes hours to days, so plan model updates carefully and verify that quality improvements justify the re-embedding cost.
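One way to make drift visible, rather than letting v1 and v2 vectors silently mismatch, is to store the embedding model version next to each vector and check it before searching. A minimal sketch, assuming a simple dict-based index; the schema and function names are illustrative, not from any particular library:

```python
# Tag every stored vector with the model version that produced it,
# so vectors from an older model are detectable instead of silently wrong.
EMBED_MODEL_VERSION = "v2"

def store_vector(index: dict, doc_id: str, vector: list[float]) -> None:
    # Persist the version alongside the vector.
    index[doc_id] = {"vector": vector, "model": EMBED_MODEL_VERSION}

def stale_docs(index: dict, query_model: str) -> list[str]:
    # Doc ids that must be re-embedded before they can be compared
    # against queries embedded with query_model.
    return [doc_id for doc_id, entry in index.items()
            if entry["model"] != query_model]

index = {}
store_vector(index, "doc-1", [0.1, 0.2])
index["doc-0"] = {"vector": [0.3, 0.4], "model": "v1"}  # leftover v1 vector
stale = stale_docs(index, EMBED_MODEL_VERSION)
```

In a real system the version tag would live in your vector database's metadata field, and the stale list would drive a background re-embedding job.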
Query-Document Length Mismatch
Short queries and long documents may not align well: a 5-word query captures limited context, while a 2000-word document's embedding averages over many concepts. Models trained specifically for retrieval (e5, bge) use asymmetric training that optimizes short-query-to-long-document matching.
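Asymmetric models typically need to be told which side of the pair a text belongs to. The e5 family, for instance, expects "query: " and "passage: " prefixes on the input text. A minimal sketch of that convention; embed() here is a toy placeholder standing in for a real model call:

```python
def embed(text: str) -> list[float]:
    # Placeholder: a real system would call the embedding model here.
    # This toy version just encodes the text length.
    return [float(len(text))]

def embed_query(text: str) -> list[float]:
    # Short-query side of the asymmetric pair (e5 prefix convention).
    return embed("query: " + text)

def embed_passage(text: str) -> list[float]:
    # Long-document side of the asymmetric pair.
    return embed("passage: " + text)
```

Omitting these prefixes is a common silent failure: the model still returns vectors, but retrieval quality degrades because inputs land in the wrong region of the trained space.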
Out-of-Domain Queries
Models trained on general text may fail on specialized domains: medical terminology or legal jargon may not be represented well in the embedding space. Test your model on representative domain queries before deployment; if generic models fall short, fine-tune on domain data.
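A lightweight pre-deployment check is recall@k over a small set of hand-labeled domain queries: for each query, does at least one known-relevant document appear in the top k? A sketch using cosine similarity, assuming you already have vectors for queries and corpus; all names here are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recall_at_k(queries: dict, corpus: dict, relevant: dict, k: int = 3) -> float:
    # queries: {query_id: vector}; corpus: {doc_id: vector};
    # relevant: {query_id: set of relevant doc_ids}.
    hits = 0
    for qid, qvec in queries.items():
        ranked = sorted(corpus, key=lambda d: cosine(qvec, corpus[d]),
                        reverse=True)
        if relevant[qid] & set(ranked[:k]):
            hits += 1
    return hits / len(queries)
```

Even 50 labeled domain queries give a usable signal; if recall@k on them is far below your general-domain numbers, the model is a poor fit for the domain.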
False Confidence
Semantic search always returns results, even for nonsense queries; there is no built-in "no results found." A query about "quantum banana teleportation" returns the closest documents even if none are relevant. Set a minimum similarity threshold, and monitor whether users frequently click results ranked 5th or lower, a sign that the top results were not helpful.
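The threshold can be applied as a simple post-filter on scored results, so a nonsense query can come back empty instead of surfacing the nearest irrelevant documents. A sketch, with a min_score value you would tune against your own score distribution:

```python
def filter_results(results: list[tuple[str, float]],
                   min_score: float = 0.5) -> list[tuple[str, float]]:
    # results: (doc_id, similarity_score) pairs, highest first.
    # Drop weak matches so irrelevant hits never reach the user.
    return [(doc_id, score) for doc_id, score in results
            if score >= min_score]
```

The right threshold depends on the model and score metric, so calibrate it on held-out relevant and irrelevant query-document pairs rather than picking a round number.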