
Sharding Vector Indexes: Balancing Load and Latency

Sharding splits the index across machines to scale beyond single-node memory and CPU limits. The sharding strategy determines latency, load balance, and failure resilience.

Hash-based sharding distributes vectors uniformly: each item ID is hashed modulo the shard count. For 500 million vectors across 16 shards, each shard holds roughly 31 million vectors. Queries fan out to all shards; each shard searches its local index in parallel and returns its top 200 candidates in 15 milliseconds median and 30 milliseconds p99, then a coordinator merges them into a global top 500. This is simple and balances writes well, but every query hits all shards. At 10,000 queries per second (QPS) system wide, each shard also serves 10,000 QPS, requiring substantial CPU.

Range-based or learned sharding routes queries to fewer shards. With Inverted File (IVF) partitioning, the query vector is assigned to the nearest 3 to 5 clusters, and only the shards holding those clusters are queried. This reduces fanout from 16 shards to 3, cutting network and CPU cost by roughly 5x. The tradeoff is skew: popular clusters become hot shards. Spotify and Pinterest use coarse clustering to route queries, achieving 20 to 40 millisecond p99 latencies with 3-shard fanout versus 60 to 80 milliseconds with full fanout.

Shard sizing is a critical capacity decision. Practitioners target roughly 100 million vectors per 64 to 128 gigabyte node when using Product Quantization at 16 bytes per vector, plus graph overhead. For text search, 30 to 50 gigabyte shards are common because recovery and rebalancing complete in minutes. Smaller shards increase coordination overhead and metadata cost; larger shards concentrate load and slow failure recovery.
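Below is a minimal, single-process sketch (Python with NumPy; the shard count, data sizes, and brute-force L2 search are illustrative stand-ins for real per-shard ANN indexes) contrasting the two strategies: hash-based placement with full fan-out and a coordinator merge, versus IVF-style placement that routes each query to only the nearest few shards.

```python
import numpy as np

NUM_SHARDS = 16
DIM = 64
rng = np.random.default_rng(0)
vectors = rng.standard_normal((20_000, DIM)).astype(np.float32)


def top_k(ids: np.ndarray, vecs: np.ndarray, query: np.ndarray, k: int):
    """Exact top-k by L2 distance within one shard (stand-in for a local ANN index)."""
    dists = np.linalg.norm(vecs - query, axis=1)
    order = np.argsort(dists)[:k]
    return list(zip(dists[order].tolist(), ids[order].tolist()))


# Strategy 1: hash-based sharding -- item ID modulo shard count decides placement.
hash_shard = np.arange(len(vectors)) % NUM_SHARDS


def full_fanout_search(query: np.ndarray, k: int):
    """Every query hits all shards; the coordinator merges per-shard results."""
    candidates = []
    for s in range(NUM_SHARDS):
        ids = np.where(hash_shard == s)[0]
        candidates += top_k(ids, vectors[ids], query, k)
    return sorted(candidates)[:k]


# Strategy 2: IVF-style routed sharding -- vectors are placed by nearest coarse
# centroid, and queries probe only the nprobe nearest centroids. (Real systems
# train the coarse quantizer with k-means; random seed centroids keep this short.)
centroids = vectors[rng.choice(len(vectors), NUM_SHARDS, replace=False)]
ivf_shard = np.argmin(
    np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2), axis=1
)


def routed_search(query: np.ndarray, k: int, nprobe: int = 3):
    """Fan out only to the nprobe shards whose centroids are closest to the query."""
    nearest = np.argsort(np.linalg.norm(centroids - query, axis=1))[:nprobe]
    candidates = []
    for s in nearest:
        ids = np.where(ivf_shard == s)[0]
        candidates += top_k(ids, vectors[ids], query, k)
    return sorted(candidates)[:k]


query = rng.standard_normal(DIM).astype(np.float32)
print(full_fanout_search(query, k=5))   # touches all 16 shards
print(routed_search(query, k=5))        # touches only 3 shards
```

The routed variant trades recall for fanout: any true neighbor stored in a shard outside the probed set is missed, which is why production systems tune the number of probed clusters against a recall target.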
💡 Key Takeaways
Hash sharding provides uniform load distribution. Every shard receives equal writes and equal query traffic. Simple to implement and reason about, but requires full fanout for global top K results.
Inverted File or learned routing reduces fanout from 16 shards to 3 to 5, cutting query latency by 40 to 60 percent. Meta and Google use coarse quantization to route queries, achieving 20 millisecond p99 per query versus 50 to 80 milliseconds with full fanout.
Hot shard problems emerge with skewed data. Popular categories or trending items can concentrate 3x to 5x more traffic on certain shards. Monitor per-shard QPS and p99 latency, not just cluster averages.
Shard sizing balances recovery time and efficiency. For vector indexes, 5 to 10 million vectors per shard at 20 bytes per vector is only 100 to 200 megabytes, supporting fast replication (see the capacity sketch after this list). Text search uses 30 to 50 gigabyte shards for minute-scale recovery.
Replicas are mandatory for availability and read scaling. Two to three replicas per shard tolerate node failures and distribute query load. At 10,000 QPS per shard, three replicas handle 30,000 QPS combined.
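As a quick back-of-the-envelope check on the sizing and replica numbers above (the helper name is illustrative; the byte counts and QPS figures mirror this section):

```python
def shard_payload_gb(num_vectors: int, bytes_per_vector: int) -> float:
    """Raw vector payload per shard, excluding graph/IVF metadata overhead."""
    return num_vectors * bytes_per_vector / 1e9


# 5-10 million vectors at 20 bytes each -> 0.1-0.2 GB of payload per shard,
# small enough to re-replicate quickly after a node failure.
print(shard_payload_gb(5_000_000, 20))    # 0.1
print(shard_payload_gb(10_000_000, 20))   # 0.2

# Read capacity scales with replica count: three replicas each serving
# 10,000 QPS give roughly 30,000 QPS for that shard's key range.
replicas, qps_per_replica = 3, 10_000
print(replicas * qps_per_replica)         # 30000
```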
📌 Examples
Pinterest uses coarse clustering for visual search. Query embeddings route to 3 to 5 clusters out of 256 total, reducing shard fanout and achieving 25 millisecond p99 latency at 95 percent recall.
Meta FAISS deployments shard by Inverted File partitions. Training identifies 4,096 clusters, and queries probe the nearest 32, fanning out to 32 shards instead of all shards and reducing network and CPU cost by 10x (see the IVF sketch after these examples).
Elasticsearch log search systems use daily indexes with 30 to 50 gigabyte shards. Five primary shards with one replica each provide 10 effective shards for parallel search and tolerate a single node failure.
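The sketch below shows the IVF probe mechanics from the FAISS example on a single node (it is not the multi-shard deployment itself); the dataset size and nlist=256 / nprobe=8 values are scaled down from the 4,096-cluster, 32-probe production figures so the toy run stays fast.

```python
import numpy as np
import faiss

d = 128                                              # embedding dimensionality (illustrative)
xb = np.random.rand(100_000, d).astype("float32")    # database vectors
xq = np.random.rand(5, d).astype("float32")          # query vectors

nlist = 256                                          # number of coarse IVF clusters
quantizer = faiss.IndexFlatL2(d)                     # coarse quantizer over cluster centroids
index = faiss.IndexIVFFlat(quantizer, d, nlist)

index.train(xb)    # k-means learns the nlist coarse centroids
index.add(xb)      # each vector goes into its nearest centroid's inverted list

index.nprobe = 8   # probe only the 8 nearest clusters per query
D, I = index.search(xq, 10)   # top-10 distances and ids per query
print(I)
```

In a sharded deployment, each inverted list (or group of lists) lives on its own shard, so the probe count directly bounds how many shards a query fans out to.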