Model Evolution and Dual Indexing
Changing embedding models or index parameters breaks incremental updates because new and old vectors are incompatible. If you retrain your encoder and change the output dimension from 512 to 768, or switch from a transformer to a different architecture, you cannot mix old and new embeddings in the same index: similarity scores become meaningless and recall collapses. The same applies to major index parameter changes, such as switching from HNSW to a quantized structure or significantly changing the graph degree.
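One lightweight safeguard is to tag every index with the model version and dimension it was built for and reject writes that do not match. The sketch below is illustrative only; VersionedIndex and IndexMetadata are hypothetical names, not part of any particular vector database API.

```python
# Minimal sketch (assumed names): tag each index with the model version and
# dimension it was built for, and refuse writes that do not match, so old and
# new embeddings can never end up in the same index.
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class IndexMetadata:
    model_version: str   # e.g. "encoder-v1"
    dimension: int       # e.g. 512 or 768


class VersionedIndex:
    def __init__(self, metadata: IndexMetadata):
        self.metadata = metadata
        self.vectors: dict[str, np.ndarray] = {}

    def upsert(self, item_id: str, vector: np.ndarray, model_version: str) -> None:
        # Reject vectors produced by a different model or with a different dimension.
        if model_version != self.metadata.model_version:
            raise ValueError(
                f"vector from {model_version} cannot go into an index "
                f"built for {self.metadata.model_version}"
            )
        if vector.shape[-1] != self.metadata.dimension:
            raise ValueError(
                f"expected dim {self.metadata.dimension}, got {vector.shape[-1]}"
            )
        self.vectors[item_id] = vector


# Usage: a 768-dim vector from the new encoder is rejected by the old 512-dim index.
old_index = VersionedIndex(IndexMetadata(model_version="encoder-v1", dimension=512))
old_index.upsert("item-1", np.random.rand(512).astype("float32"), "encoder-v1")  # ok
# old_index.upsert("item-2", np.random.rand(768).astype("float32"), "encoder-v2")  # raises ValueError
```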
The safe approach is dual indexing with a gradual traffic shift. Build a new index version in parallel with the old one. Dual write all updates to both indexes, applying new embeddings to the new index and old embeddings to the old. Gradually shift read traffic from 0 percent to 100 percent on the new index over hours or days, monitoring quality metrics such as click-through rate, precision at k, and null rate. If metrics degrade beyond tolerance, roll back traffic to the old index instantly. Keep the old index running until confidence checks pass, then decommission it.
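A minimal sketch of the dual-write and traffic-ramp logic, assuming both indexes expose upsert and search methods; DualIndexRouter, ramp_fraction, and the deterministic user bucketing are illustrative choices, not a prescribed design.

```python
import hashlib


class DualIndexRouter:
    def __init__(self, old_index, new_index, ramp_fraction: float = 0.0):
        self.old_index = old_index
        self.new_index = new_index
        self.ramp_fraction = ramp_fraction  # share of read traffic served by the new index

    def write(self, item_id, old_vector, new_vector):
        # Dual write: every update lands in both indexes, each with its own embedding.
        self.old_index.upsert(item_id, old_vector)
        self.new_index.upsert(item_id, new_vector)

    def search(self, old_query_vector, new_query_vector, user_id: str, k: int = 10):
        # Deterministic bucketing by user id keeps each user on one index during the
        # ramp, which makes per-cohort quality metrics comparable.
        bucket = (int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100) / 100.0
        if bucket < self.ramp_fraction:
            return self.new_index.search(new_query_vector, k)
        return self.old_index.search(old_query_vector, k)

    def set_ramp(self, fraction: float) -> None:
        # Ramp from 0.0 to 1.0 over hours or days; rollback is setting it back to 0.0.
        self.ramp_fraction = max(0.0, min(1.0, fraction))
```

Because each user is pinned to one index for a given ramp fraction, rollback only requires setting the fraction back to zero; no re-routing state has to be unwound.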
This requires significant infrastructure. You need a feature flag or configuration service to control traffic splits per query type or user cohort, and a shadow embedding service running the new model in parallel with the old, generating embeddings for both indexes. Storage and compute costs double during the transition. A typical migration for a large-scale system with 100 million items might take 1 to 3 days for the index build and 2 to 7 days for the traffic ramp with continuous quality monitoring.
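The shadow embedding path can be as simple as running both encoders on every updated item, as in this hedged sketch; encode_old and encode_new stand in for whatever inference endpoints you actually operate.

```python
from typing import Callable, Sequence

import numpy as np


def shadow_embed(
    texts: Sequence[str],
    encode_old: Callable[[Sequence[str]], np.ndarray],  # e.g. the 512-dim production model
    encode_new: Callable[[Sequence[str]], np.ndarray],  # e.g. the 768-dim candidate model
) -> list[tuple[np.ndarray, np.ndarray]]:
    # Two forward passes per updated item: this is where the doubled compute cost
    # during the transition comes from. Each pair feeds the dual write above.
    old_vectors = encode_old(texts)
    new_vectors = encode_new(texts)
    return list(zip(old_vectors, new_vectors))
```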
Training-serving skew is another failure mode. If your model was trained on batch features but serves with real-time features, accuracy drops. For example, if you train on aggregated 7-day user engagement but serve with only 24-hour fresh features, the distribution shifts and recall can drop 10 to 20 percent. Validate feature parity between training and serving before deploying. Feature stores like Tecton or Feast help ensure consistent feature computation across offline training and online serving.
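A parity check can be a sampled comparison of offline and online feature values before rollout. In the sketch below, fetch_offline_features and fetch_online_features are assumed hooks into your feature store, not a specific Tecton or Feast API, and the 5 percent tolerance is a placeholder.

```python
from typing import Callable, Mapping, Sequence


def check_feature_parity(
    entity_ids: Sequence[str],
    feature_names: Sequence[str],
    fetch_offline_features: Callable[[str], Mapping[str, float]],
    fetch_online_features: Callable[[str], Mapping[str, float]],
    rel_tolerance: float = 0.05,
) -> dict[str, float]:
    """Return, per feature, the share of sampled entities whose offline and online
    values diverge beyond the relative tolerance."""
    mismatch_counts = {name: 0 for name in feature_names}
    for entity_id in entity_ids:
        offline = fetch_offline_features(entity_id)
        online = fetch_online_features(entity_id)
        for name in feature_names:
            off, on = offline.get(name), online.get(name)
            if off is None or on is None:
                mismatch_counts[name] += 1  # missing on one side counts as skew
                continue
            denom = max(abs(off), 1e-9)
            if abs(off - on) / denom > rel_tolerance:
                mismatch_counts[name] += 1
    return {name: count / len(entity_ids) for name, count in mismatch_counts.items()}


# Deploy-gate example: block rollout if any feature diverges for more than 1% of samples.
# parity_report = check_feature_parity(sample_ids, features, fetch_offline, fetch_online)
# assert all(rate <= 0.01 for rate in parity_report.values())
```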
💡 Key Takeaways
• Changing embedding dimensions or models makes old and new vectors incompatible; mixing them in one index makes similarity scores meaningless
• Dual indexing with a gradual traffic shift takes 3 to 10 days for large systems: build the new index, dual write updates, and ramp traffic while monitoring quality
• Storage and compute costs double during migration; a shadow embedding service must generate old and new representations in parallel
• Training-serving skew causes a 10 to 20 percent recall drop when batch training features differ from real-time serving features; use feature stores for consistency
• Monitor click-through rate, precision at k, and null rate during the ramp; roll back instantly if metrics degrade beyond a 2 to 5 percent tolerance
• Keep the old index running until confidence checks pass; a typical 100 million item migration takes 1 to 3 days to build and 2 to 7 days to ramp
📌 Examples
Embedding change: upgrade from a 512-dim sentence transformer to a 768-dim model; build a new HNSW index, dual write for 3 days, and ramp from 1% to 100% over 4 days while monitoring precision
Training-serving skew: a model trained on 7-day aggregated features but served with 24-hour fresh features loses 15% recall; retrain with 24-hour features to fix
Pinterest model upgrade: a new embedding model improves engagement by 3%; dual index for 5 days, ramp traffic with an A/B test, and decommission the old index after 100% adoption