When Should You Choose LSM Trees Over B+ Trees?
Write Optimization
LSM (Log-Structured Merge) trees optimize for write throughput by buffering writes in memory before flushing to disk. Writes go to two places: (1) a memtable, an in-memory sorted structure that accepts new data, and (2) a write-ahead log (WAL), an append-only file that survives crashes. When the memtable reaches 64-256MB, it flushes to disk as an SSTable (Sorted String Table), an immutable file of sorted key-value pairs.
Performance Characteristics
LSM trees deliver 50K-100K writes/sec because all disk writes are sequential (appending to files), avoiding the random page updates that slow B+ trees to 10K-50K writes/sec. The trade-off: reads may need to check multiple SSTables to find a key. This "read amplification" is mitigated by Bloom filters, probabilistic structures that quickly answer "definitely not here" or "maybe here" for each file.
Compaction Strategies
Compaction merges SSTables to reduce files reads must check. Size-tiered: merges files of similar size into larger ones, minimizing write cost but leaving many files for reads to check. Leveled: organizes files into levels where each level is 10x larger, ensuring fewer files per key lookup but requiring more rewrites. Under heavy writes, compaction can fall behind, causing "write stalls" where new writes block.
When to Choose
Use LSM for write-heavy workloads: logs, metrics, time-series. Use B+ trees for read-heavy workloads with frequent updates: transactional systems, interactive apps.