
Sharding Vector Indexes: Balancing Load and Latency

UPDATE STRATEGIES

Indexes need updates when: new content is added, existing content changes (embeddings updated), or content is deleted. Each has different solutions and tradeoffs.

Full rebuild: Regenerate entire index from scratch. Most accurate but slowest. Use for major embedding model updates or when incremental drift becomes unacceptable. Typical cadence: weekly to monthly.

Incremental update: Add new vectors to existing index structure. Fast but may degrade quality over time. HNSW supports this naturally; IVF-PQ requires assigning to existing centroids.

Hybrid: Maintain a small "delta" index for recent items, periodically merge into main index. Balances freshness and quality.
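The hybrid strategy can be sketched in a few lines. This is a minimal illustration, not a production design: both tiers are brute-force dictionaries (a real main tier would be an ANN index), and the class name `HybridIndex` and the `merge_threshold` parameter are hypothetical names chosen for the example.

```python
import math


class HybridIndex:
    """Toy main + delta index. Both tiers are scanned exhaustively here;
    in practice the main tier would be an ANN structure such as HNSW or IVF."""

    def __init__(self, merge_threshold=1000):
        self.main = {}    # id -> vector, rebuilt or merged periodically
        self.delta = {}   # id -> vector, holds recent inserts and updates
        self.merge_threshold = merge_threshold  # hypothetical tuning knob

    def add(self, item_id, vector):
        # New or changed items always land in the delta tier first.
        self.delta[item_id] = vector
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def merge(self):
        # Fold the delta tier into the main tier; delta entries win on conflicts.
        self.main.update(self.delta)
        self.delta.clear()

    def search(self, query, k):
        # Query both tiers, letting the delta copy shadow a stale main copy.
        scored = [(math.dist(query, v), i) for i, v in self.delta.items()]
        scored += [
            (math.dist(query, v), i)
            for i, v in self.main.items()
            if i not in self.delta
        ]
        scored.sort()
        return [i for _, i in scored[:k]]
```

Queries pay the cost of searching two structures, which is why the delta tier is kept small and merged on a schedule or size trigger.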

INCREMENTAL UPDATE MECHANICS

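HNSW incremental: Insert new vectors by finding neighbors in the existing graph and adding edges. Quality can degrade slightly over time: each insert links a vector using only the graph as it exists at that moment, so neighborhoods may end up less well connected than in a fresh build. Rebuild when recall drops by 2-3 percentage points.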

IVF incremental: Assign new vectors to nearest existing centroid, add to that partition. Centroids become stale as distribution shifts. If >20% of vectors are post-training, centroids may be misaligned.
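A rough sketch of IVF-style incremental assignment follows, assuming centroids were trained earlier and are passed in as plain lists. The class `IVFPartitions`, its method names, and the `max_fraction` default are hypothetical; the 20% figure is taken from the guideline above.

```python
import math


class IVFPartitions:
    """Toy inverted-file layout: fixed centroids, one list of vectors per centroid."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.lists = [[] for _ in centroids]
        self.trained_count = 0        # vectors present when centroids were trained
        self.post_training_count = 0  # vectors added afterwards

    def nearest_centroid(self, vector):
        return min(
            range(len(self.centroids)),
            key=lambda c: math.dist(vector, self.centroids[c]),
        )

    def add(self, item_id, vector, post_training=True):
        # Centroids are never moved here; the vector simply joins the closest partition.
        c = self.nearest_centroid(vector)
        self.lists[c].append((item_id, vector))
        if post_training:
            self.post_training_count += 1
        else:
            self.trained_count += 1
        return c

    def post_training_fraction(self):
        total = self.trained_count + self.post_training_count
        return self.post_training_count / total if total else 0.0

    def needs_retraining(self, max_fraction=0.20):
        # Flags the index once post-training vectors exceed the chosen share.
        return self.post_training_fraction() > max_fraction
```

The fraction is only a proxy for staleness. If the new vectors follow the same distribution as the training set, the centroids may remain adequate well past the threshold.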

Deletion: Most indexes support soft deletion (mark as deleted, filter at query time). Hard deletion requires compaction or rebuild. Soft-delete overhead: 5-10% query slowdown as deleted vectors are still scanned.
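Soft deletion with later compaction might look like the sketch below. It uses a brute-force scan in place of a real index, and `SoftDeleteIndex` and its methods are hypothetical names; the point is only that tombstoned vectors are still scored until `compact` removes them.

```python
import math


class SoftDeleteIndex:
    """Toy index with tombstones: deletes are cheap, cleanup is deferred."""

    def __init__(self):
        self.vectors = {}        # id -> vector
        self.tombstones = set()  # ids marked deleted but still stored

    def add(self, item_id, vector):
        self.vectors[item_id] = vector
        self.tombstones.discard(item_id)  # re-adding an id revives it

    def delete(self, item_id):
        # Soft delete: mark only; the vector stays in storage.
        if item_id in self.vectors:
            self.tombstones.add(item_id)

    def search(self, query, k):
        # Every stored vector is scored, including deleted ones (the overhead),
        # and tombstoned ids are filtered out of the ranked results.
        scored = sorted((math.dist(query, v), i) for i, v in self.vectors.items())
        return [i for _, i in scored if i not in self.tombstones][:k]

    def deleted_fraction(self):
        return len(self.tombstones) / len(self.vectors) if self.vectors else 0.0

    def compact(self):
        # Hard delete: physically drop tombstoned vectors.
        for item_id in self.tombstones:
            del self.vectors[item_id]
        self.tombstones.clear()
```

Tracking `deleted_fraction` gives a simple signal for when compaction or a rebuild is worth its cost.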

WHEN TO REBUILD

Monitor recall on a fixed query set. When recall drops below a threshold (e.g., from 0.95 to 0.92), trigger a rebuild. Also rebuild after embedding model updates, because vectors produced by the old and new models are not comparable and cannot share one index.
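One way to implement this check is to compare the index's results against exact brute-force neighbors on the fixed query set. The sketch below assumes the index is exposed as a callable `ann_search(query, k)` returning ids; that callable, the function names, and the `threshold=0.93` default are illustrative assumptions rather than values from any particular system.

```python
import math


def exact_top_k(vectors, query, k):
    """Ground truth: ids of the k nearest vectors by exhaustive scan."""
    return sorted(vectors, key=lambda i: math.dist(query, vectors[i]))[:k]


def recall_at_k(ann_search, vectors, queries, k):
    """Average fraction of true top-k neighbors that the index returned."""
    hits = 0
    for q in queries:
        truth = set(exact_top_k(vectors, q, k))
        found = set(ann_search(q, k))
        hits += len(truth & found)
    return hits / (k * len(queries))


def should_rebuild(current_recall, threshold=0.93):
    # threshold is a hypothetical setting; choose it from your own quality target.
    return current_recall < threshold
```

Because the ground truth is recomputed by exhaustive scan, this is usually run offline on a few hundred to a few thousand queries rather than on live traffic.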

⚠️ Key Trade-off: Incremental updates are fast but accumulate quality debt. Track recall drift and schedule periodic full rebuilds to reset quality.
💡 Key Takeaways
Full rebuild: most accurate, use for model updates; weekly to monthly cadence
Incremental: fast but quality degrades; rebuild when recall drops 2-3%
Soft deletion: filter at query time; 5-10% overhead; rebuild for compaction
📌 Interview Tips
1. Interview Tip: Explain the rebuild trigger: monitor recall on a fixed query set and rebuild when it drops below a threshold.
2. Interview Tip: Describe the hybrid strategy: a delta index for freshness, with periodic merges into the main index.