
When Should You Choose LSM Trees Over B+ Trees?

Log-Structured Merge (LSM) trees optimize for write throughput by batching updates in memory and flushing them sequentially to disk, avoiding the random writes and page-split overhead inherent in B+ trees. Writes first go to an in-memory structure called a memtable (typically a skip list or red-black tree) and an append-only Write-Ahead Log (WAL). When the memtable reaches a threshold, typically 64 to 256 MB, it is flushed to an immutable on-disk file called an SSTable (Sorted String Table). Background compaction merges SSTables across levels to bound the number of files a read must check.

This architecture delivers sustained write throughput of 50K to 100K operations per second on commodity SSDs because writes are always sequential. Google Cloud Bigtable achieves single-digit millisecond p50 read latency for point lookups when the working set is warm, touching only 1 to 3 SSTables after recent compactions. Amazon DynamoDB uses LSM trees and reports single-digit millisecond p50 for both reads and writes at any scale, with Global Secondary Indexes (GSIs) each backed by separate LSM trees.

The tradeoff is read amplification and unpredictable tail latency. A point lookup must check the memtable plus every SSTable that might contain the key. Bloom filters (typically 10 bits per key, yielding a roughly 1 percent false positive rate) skip most negative lookups, but positive hits still require reading from multiple levels. Range scans suffer more because they cannot rely on Bloom filters and must merge-sort results from many SSTables. During heavy write periods, compaction can fall behind, increasing the number of SSTables a read must check from 3 to 10 or more, which directly inflates p99 read latency.

LSM trees shine in write-heavy Key-Value (KV) and NoSQL workloads where point lookups dominate and you can tolerate occasional latency spikes during compaction. B+ trees remain superior for read-heavy Online Transaction Processing (OLTP) with frequent range queries and strict p99 latency requirements, because they offer predictable 3 to 4 page read costs regardless of write load.
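To make these mechanics concrete, here is a minimal Python sketch of the write and read paths, assuming a toy single-level store with no compaction, Bloom filters, or block indexes. The class name `ToyLSM`, the JSON-lines WAL format, and the entry-count flush threshold are illustrative assumptions, not any real engine's API.

```python
import json

class ToyLSM:
    """Toy LSM store: append-only WAL + in-memory memtable + immutable SSTables.

    Real engines (RocksDB, Cassandra, Bigtable) add levels, compaction,
    Bloom filters, and block indexes; this only shows the basic flow.
    """

    def __init__(self, wal_path="wal.log", flush_threshold=4):
        self.wal = open(wal_path, "a")          # sequential, append-only log for durability
        self.memtable = {}                      # stand-in for a skip list / red-black tree
        self.sstables = []                      # flushed, immutable sorted runs (newest first)
        self.flush_threshold = flush_threshold  # counted in entries, not MB, for the toy example

    def put(self, key, value):
        # 1. Durability first: sequential append to the WAL.
        self.wal.write(json.dumps({"k": key, "v": value}) + "\n")
        self.wal.flush()
        # 2. Apply the write to the memtable (an overwrite is just a newer version).
        self.memtable[key] = value
        # 3. Flush to an immutable SSTable once the memtable hits its threshold.
        if len(self.memtable) >= self.flush_threshold:
            self._flush()

    def _flush(self):
        sstable = sorted(self.memtable.items())  # SSTables are sorted by key
        self.sstables.insert(0, sstable)         # newest run first
        self.memtable = {}

    def get(self, key):
        # Read path: memtable first, then SSTables from newest to oldest.
        # Without Bloom filters every run may need to be probed -- this is
        # the read amplification the section describes.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.sstables:
            for k, v in run:  # a real SSTable would binary-search a block index
                if k == key:
                    return v
        return None

db = ToyLSM()
for i in range(10):
    db.put(f"user:{i}", {"visits": i})
print(db.get("user:3"), len(db.sstables))  # value found in a flushed run; 2 SSTables on disk
```

Note how every disk write in the sketch is an append or a whole-file flush; the random I/O that a B+ tree would spend on page updates and splits simply never happens here, which is the source of the write-throughput advantage described above.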
💡 Key Takeaways
Sequential writes enable 50K to 100K sustained write operations per second on SSDs compared to 10K to 20K for B+ trees with random page updates
Bloom filters at 10 bits per key yield a roughly 1 percent false positive rate (the short calculation after this list reproduces this figure), allowing reads to skip 99 percent of SSTables that do not contain the target key
Read amplification increases during heavy write periods: the normal state checks 1 to 3 SSTables, but compaction debt can push this to 10 or more files, inflating p99 latency by 3x to 5x
Each secondary index in LSM systems is a separate LSM tree, so adding 2 Global Secondary Indexes triples write amplification from 1x to 3x across primary and index trees
Range scans must merge sort results from multiple SSTables without Bloom filter benefits, making them 5x to 10x slower than B+ tree range scans on the same dataset
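The 10-bits-per-key figure follows from the standard Bloom filter false positive approximation p ≈ (1 - e^(-k/(m/n)))^k, where k is the number of hash functions and m/n is bits per key. A quick calculation (the function name is hypothetical) reproduces the roughly 1 percent rate quoted above:

```python
import math

def bloom_false_positive_rate(bits_per_key: float, num_hashes: int) -> float:
    """Standard approximation: p ~= (1 - e^(-k / (m/n)))^k."""
    return (1.0 - math.exp(-num_hashes / bits_per_key)) ** num_hashes

bits_per_key = 10
optimal_k = round(bits_per_key * math.log(2))   # k = (m/n) * ln 2 ~= 7 hash functions
print(optimal_k, bloom_false_positive_rate(bits_per_key, optimal_k))
# 7 0.0081... -> roughly 1 percent, matching the takeaway above
```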
📌 Examples
Google Cloud Bigtable: Single-digit millisecond p50 reads for point lookups when row and column-family working sets are warm; Bloom filters reduce disk touches by skipping negative lookups across 3 SSTable levels
Amazon DynamoDB: At 50K to 100K writes per second, Global Secondary Index lag can reach seconds when compaction and asynchronous propagation saturate, causing temporary read-your-writes inconsistency
Apache Cassandra (LSM): A table with 3 secondary indexes writing at 20K ops/sec generates 80K total write operations (20K primary + 60K index updates), increasing storage Input/Output (I/O) and compaction load proportionally; the sketch below shows the fan-out arithmetic
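The arithmetic in the Cassandra example is simply the base write rate multiplied by one plus the number of secondary indexes, since each index is maintained as its own LSM tree. A one-line sketch (the function name is hypothetical):

```python
def total_write_ops(base_ops_per_sec: int, num_secondary_indexes: int) -> int:
    """Every base write fans out into one primary write plus one write per index."""
    return base_ops_per_sec * (1 + num_secondary_indexes)

print(total_write_ops(20_000, 3))  # 80000 = 20K primary + 60K index updates
```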