Update Strategies: Deletes, Tombstones, and Compaction
Updating indexes in place is expensive. Vector indexes store compressed representations and graph structures that are tightly packed. Rewriting on every update would cause prohibitive write amplification. Instead, production systems apply updates via delta structures and periodic compaction.
Deletes are handled with tombstones. When an item is deleted, a marker is written to a separate structure or embedded in the index metadata. Queries honor tombstones at read time, skipping marked items. This avoids immediate compaction but inflates memory and reduces effective recall. If 20 percent of vectors are tombstoned, a search for top 100 might need to scan 125 candidates to return 100 live items. Compaction runs in the background, typically when tombstones exceed 20 to 30 percent of entries, and rebuilds the index without deleted items.
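A minimal sketch of the read-time path, assuming a generic ANN index whose `search(query, n)` returns the n nearest (id, distance) pairs; the function name and over-fetch heuristic are illustrative, not any particular library's API:

```python
import math

def search_with_tombstones(index, query, k, tombstones, tombstone_fraction):
    """Return up to k live results, over-fetching past tombstoned items.

    `index.search(query, n)` stands in for any ANN call returning the n
    nearest (id, distance) pairs.
    """
    # With a fraction d tombstoned, ~k / (1 - d) candidates are needed on
    # average to yield k live hits (125 for k=100 at d=0.20).
    n = math.ceil(k / (1.0 - tombstone_fraction))
    candidates = index.search(query, n)
    live = [(i, dist) for i, dist in candidates if i not in tombstones]
    if len(live) < k:
        # Tombstones clustered near the query: widen once and refilter.
        candidates = index.search(query, 2 * n)
        live = [(i, dist) for i, dist in candidates if i not in tombstones]
    return live[:k]
```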
Updates are modeled as delete plus insert. The old vector is tombstoned, and the new vector is appended to a delta index. For a product catalog with 10 percent daily churn (50 million updates out of 500 million items), this generates 50 million tombstones and 50 million new vectors every day. Without compaction, memory usage would grow 10 percent daily. Nightly compaction rebuilds the main index, trains updated quantizers to match the new distribution, and resets the delta.
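A sketch of the delete-plus-insert path with a delta pool and a churn-based compaction trigger; the plain dicts and the `rebuild` callback are illustrative stand-ins for a real quantized index and its retraining step:

```python
class DeltaVectorIndex:
    """Immutable main index + append-only delta + tombstone set (sketch)."""

    COMPACT_AT = 0.25  # rebuild once ~25% of entries are dead

    def __init__(self, main_vectors):
        self.main = dict(main_vectors)  # id -> vector; really a quantized/graph index
        self.delta = {}                 # id -> vector; small, searched exactly
        self.tombstones = set()         # ids deleted or superseded in main

    def update(self, item_id, vector):
        # Update = delete + insert: tombstone the old copy, append the new.
        if item_id in self.main:
            self.tombstones.add(item_id)
        self.delta[item_id] = vector

    def delete(self, item_id):
        self.tombstones.add(item_id)
        self.delta.pop(item_id, None)

    def needs_compaction(self):
        total = len(self.main) + len(self.delta)
        return total > 0 and len(self.tombstones) / total > self.COMPACT_AT

    def compact(self, rebuild):
        # `rebuild` retrains quantizers on the surviving vectors and builds
        # a fresh main index; typically run nightly in the background.
        live = {i: v for i, v in self.main.items() if i not in self.tombstones}
        live.update(self.delta)
        self.main = rebuild(live)
        self.delta.clear()
        self.tombstones.clear()
```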
Merge storms occur when many small segments are compacted simultaneously. Lucene and Elasticsearch use tiered merge policies to smooth IO and CPU spikes, merging segments of similar size and throttling during peak hours. Vector indexes face similar issues: merging multiple delta graphs into the main Hierarchical Navigable Small World structure can spike CPU for minutes, degrading p99 latencies by 2x to 3x during compaction.
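A sketch of the tiered-merge selection idea, grouping only similarly sized segments so one merge never mixes 100 tiny segments with a huge one; the segment representation and thresholds are illustrative, not Lucene's actual implementation:

```python
def pick_merge(segments, max_merge_at_once=10, size_ratio=2.0):
    """Return a group of similarly sized segments to merge, or None.

    `segments` is a list of {"name": ..., "size_mb": ...} dicts. Bounding
    the group by a size ratio and a max fan-in smooths IO/CPU spikes.
    """
    segs = sorted(segments, key=lambda s: s["size_mb"])
    if not segs:
        return None
    group = [segs[0]]
    for s in segs[1:]:
        if s["size_mb"] > size_ratio * group[0]["size_mb"]:
            break  # sorted order: every later segment also fails the ratio
        group.append(s)
        if len(group) == max_merge_at_once:
            break
    return group if len(group) >= 2 else None

# e.g. pick_merge([{"name": "s1", "size_mb": 40}, {"name": "s2", "size_mb": 55},
#                  {"name": "s3", "size_mb": 900}]) merges only s1 and s2.
```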
💡 Key Takeaways
• Tombstones defer deletion cost to read time. Each tombstoned item consumes memory and forces the query to over-fetch. If 30 percent of vectors are deleted and you want top 100, you need roughly 143 candidates (100 / 0.7), about 43 percent more retrieval work (see the sketch after this list).
• Compaction is mandatory for memory and performance. Running nightly compaction on a 500 million vector index with 10 percent daily churn reclaims 50 million tombstoned vectors, saving 1 to 2 gigabytes of memory and restoring recall.
• Write amplification from compaction can reach 3x to 5x. Every written segment is eventually rewritten during merge. Reducing refresh frequency from 1 second to 30 seconds can halve write amplification, improving throughput from 2,000 to 5,000 writes per second per shard.
• Merge policies control IO spikes. Tiered merge in Elasticsearch merges segments of similar size, avoiding pathological cases where 100 tiny segments merge at once and spike CPU for minutes.
• Consistency bugs can leak stale data. If a query skips the tombstone check due to a code path bug, deleted or restricted items can appear in results, causing legal or compliance issues.
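The over-fetch numbers above follow from a single formula, candidates ≈ k / (1 − d) for tombstone fraction d; a quick check:

```python
import math

def candidates_needed(k, tombstone_fraction):
    """Candidates to fetch so that ~k results survive tombstone filtering."""
    return math.ceil(k / (1.0 - tombstone_fraction))

print(candidates_needed(100, 0.20))  # 125: top-100 at 20% tombstoned
print(candidates_needed(100, 0.30))  # 143: top-100 at 30% tombstoned
```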
📌 Examples
Elasticsearch applies deletes as tombstones in a .del file per segment. When tombstones exceed 30 percent, a merge removes them. Users report 5 to 10 minute merge times for 50 gigabyte segments on SSD.
Meta's FAISS uses versioned embeddings. Deletes mark a version as invalid, and queries filter by version at scan time. Compaction rebuilds the index without invalid versions, typically overnight.
Spotify encountered tombstone accumulation in Annoy when delete rate spiked during content purges. Recall dropped from 98 percent to 92 percent until a manual rebuild cleared tombstones.