
Trade-offs in Inverted Index Design: Freshness, Size, and Query Speed

Inverted index design involves several fundamental trade-offs that affect operational cost, query performance, and user experience.

The freshness versus write amplification trade-off is central: near-real-time indexing with frequent small segments (committed every 1 to 5 seconds) yields fast visibility but creates many segments, triggering frequent merges that consume CPU and I/O. This write amplification can reach 5 to 10 times the raw write rate (a rough cost model appears in the first sketch below). Batch indexing reduces merges and boosts query throughput by maintaining fewer, larger segments, but increases staleness to minutes or hours, which is unacceptable for user-generated content or rapidly changing inventory.

Query speed versus index size is another critical decision. Storing positional postings enables phrase queries, proximity search, and accurate snippet highlighting, but increases index size by 1.5 to 3 times versus document-only postings. If your workload is purely Boolean keyword matching (such as filtering logs or structured data), omit positions to save 60 to 70 percent of postings space and speed up queries. For user-facing search where phrase queries and snippets matter, positions are mandatory despite the cost (the second sketch below contrasts the two posting formats).

Recall versus latency trade-offs arise with synonym expansion, wildcard matching, and fuzzy search. Expanding "TV" to "television, telly, tele" increases recall but multiplies candidate sets, spiking latency. Use aggressive pruning such as WAND, limit expansions to the top N synonyms (e.g., 5 to 10), or employ two-stage retrieval: a tight first-stage BM25 pass with minimal expansion under 50 milliseconds, then a second stage that reranks with richer features and expansions under a larger time budget (the third sketch below outlines this pipeline).
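To make the write-amplification numbers concrete, here is a rough back-of-envelope model, assuming tiered merging in which each document is rewritten roughly once per merge level. The merge fan-out, commit sizes, and document counts are illustrative assumptions, not measurements from any particular engine.

```python
import math

def estimated_write_amplification(total_docs: int,
                                  docs_per_commit: int,
                                  merge_fanout: int = 10) -> float:
    """Rough model: one initial write plus one rewrite per merge level,
    where the number of levels grows with log_fanout(total / commit size).
    Illustrative only; real merge policies are more nuanced."""
    if total_docs <= docs_per_commit:
        return 1.0
    levels = math.ceil(math.log(total_docs / docs_per_commit, merge_fanout))
    return 1.0 + levels

# Near-real-time: tiny per-commit segments -> many merge levels.
print(estimated_write_amplification(100_000_000, docs_per_commit=1_000))      # 6.0
# Batch: large per-commit segments -> few merge levels.
print(estimated_write_amplification(100_000_000, docs_per_commit=1_000_000))  # 3.0
```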
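The second sketch contrasts document-only and positional postings. The toy index contents are invented, and real engines use compressed on-disk encodings, but the structural difference, and why phrase matching needs positions, carries over.

```python
# Document-only postings: term -> sorted doc IDs. Sufficient for
# Boolean keyword matching; cannot answer phrase queries.
doc_only = {
    "usb": [1, 2, 5],
    "cable": [2, 3, 5],
}

# Positional postings: term -> {doc ID -> sorted positions}. Larger,
# because every occurrence is stored, but supports phrases and snippets.
positional = {
    "usb":   {1: [0], 2: [4], 5: [0, 7]},
    "cable": {2: [5], 3: [1], 5: [1]},
}

def phrase_match(index, terms):
    """Doc IDs containing `terms` as an exact phrase, found by
    intersecting positional lists shifted by each term's offset."""
    docs = set(index[terms[0]])
    for t in terms[1:]:
        docs &= set(index[t])
    hits = []
    for d in docs:
        starts = set(index[terms[0]][d])
        for i, t in enumerate(terms[1:], start=1):
            starts &= {p - i for p in index[t][d]}
        if starts:
            hits.append(d)
    return sorted(hits)

print(phrase_match(positional, ["usb", "cable"]))  # [2, 5]
```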
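The third sketch shows two-stage retrieval with capped synonym expansion. The synonym table, the toy corpus, and the stand-in first-stage and rerank functions are all hypothetical placeholders rather than any real library's API; the point is the shape of the pipeline: cheap, tightly budgeted candidate generation, then a richer rerank over a small surviving set.

```python
# Stand-in synonym table and expansion cap (hypothetical values).
SYNONYMS = {"tv": ["television", "telly", "tele", "flat screen", "smart tv"]}
MAX_EXPANSIONS = 5  # bound the candidate-set blowup per query term

def expand(terms):
    """Add at most MAX_EXPANSIONS synonyms per query term."""
    out = list(terms)
    for t in terms:
        out.extend(SYNONYMS.get(t, [])[:MAX_EXPANSIONS])
    return out

def search(query_terms, first_stage, rerank, k_first=1000, k_final=10):
    # Stage 1: minimal expansion under a tight latency budget
    # (the "under 50 milliseconds" BM25 pass in the text).
    candidates = first_stage(expand(query_terms), k_first)
    # Stage 2: richer features and more time budget, applied only
    # to the small candidate set from stage 1.
    return rerank(query_terms, candidates)[:k_final]

# Toy corpus and stand-in stages so the sketch runs end to end.
DOCS = {1: "smart tv 55 inch", 2: "tv wall mount", 3: "television remote"}

def toy_first_stage(terms, top_k):
    return [d for d, text in DOCS.items() if any(t in text for t in terms)][:top_k]

def toy_rerank(terms, candidates):
    return sorted(candidates)  # a real reranker would score richer features

print(search(["tv"], toy_first_stage, toy_rerank))  # [1, 2, 3]
```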
💡 Key Takeaways
Near-real-time indexing with 1 to 5 second commit intervals enables visibility latency under 10 seconds but causes write amplification of 5 to 10 times due to frequent merges. Batch indexing reduces amplification to 2 to 3 times but increases staleness to minutes or hours
Positional postings increase index size by 1.5 to 3 times and reduce query throughput by 20 to 40 percent versus doc-only postings. Only include positions if phrase queries, proximity search, or snippets are required; omit them for Boolean keyword matching to save 60 to 70 percent of postings space
Global term statistics yield stable scoring but require coordination or periodic snapshots, adding operational complexity. Per-shard statistics simplify operations, but rare terms that are common on one shard can dominate ranking there, causing inconsistent results across queries (see the IDF sketch after this list)
Synonym expansion, wildcard, and fuzzy matching increase recall but multiply candidate sets by 2 to 10 times, spiking tail latency. Limit expansions to the top 5 to 10 synonyms and use two-stage retrieval: a tight first stage under 50 milliseconds, then a rerank with expansions
Sharding by document is simpler and balances write load but forces queries to fan out to all shards. Term-based sharding localizes hot terms but complicates multi-term queries and updates; it is rarely used except for specialized workloads with extreme skew
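To see why per-shard statistics can skew ranking, the sketch below computes one common BM25-style IDF variant both globally and per shard for a term that happens to cluster on one shard. The corpus sizes and document frequencies are invented for illustration.

```python
import math

def idf(total_docs, doc_freq):
    # One common BM25-style IDF variant.
    return math.log(1 + (total_docs - doc_freq + 0.5) / (doc_freq + 0.5))

# A term that is clustered on one shard but rare on the other.
shards = [
    {"docs": 1_000_000, "df": 5_000},  # shard A: term is common here
    {"docs": 1_000_000, "df": 10},     # shard B: term is rare here
]
global_docs = sum(s["docs"] for s in shards)
global_df = sum(s["df"] for s in shards)

print("global IDF:", round(idf(global_docs, global_df), 2))   # ~5.99
for i, s in enumerate(shards):
    print(f"shard {i} local IDF:", round(idf(s["docs"], s["df"]), 2))
# shard 0 scores the term at ~5.30, shard 1 at ~11.46: with local
# statistics, the same term weighs roughly twice as much on shard 1,
# so merged results rank inconsistently across queries.
```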
📌 Examples
Amazon retail search omits positions on some backend analytics indexes used for filtering and faceting, saving 70 percent of postings space, but includes positions on user-facing search indexes to support phrase queries like "usb c cable" and generate accurate snippets
Elasticsearch's default refresh interval is 1 second for near-real-time search, but high-write workloads often raise it to 30 seconds or switch to batch mode to reduce merge load and improve indexing throughput from 10,000 to 50,000 documents per second (a settings sketch follows these examples)
Google Web Search uses per-shard IDF approximations calibrated with periodic global statistics snapshots to balance scoring consistency and operational simplicity, avoiding real-time coordination across trillions of documents
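As a concrete version of the Elasticsearch example above, the sketch below raises the refresh interval via the index settings API. The index name, host, and the 30-second value are assumptions matching the scenario; index.refresh_interval itself is a real Elasticsearch setting, and setting it to "-1" disables periodic refresh entirely, which is common during bulk loads.

```python
import requests

# Raise the refresh interval on a hypothetical index named "products",
# trading visibility latency for fewer segments and less merge load.
resp = requests.put(
    "http://localhost:9200/products/_settings",
    json={"index": {"refresh_interval": "30s"}},
)
print(resp.json())  # {"acknowledged": True} on success
```

The same request with "refresh_interval": "1s" restores near-real-time behavior once a bulk load finishes.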