Bloom Filters in LSM Storage Engines: Read Path Optimization
Bloom filters are a foundational component of Log-Structured Merge (LSM) tree storage engines such as Google Bigtable, Apache Cassandra, Apache HBase, and RocksDB. LSM trees organize data into immutable Sorted String Table (SSTable) files on disk, with writes buffered in an in-memory memtable and periodic compaction merging files. A point lookup for a key may need to check multiple SSTables sequentially until the key is found or all candidates are exhausted. Without Bloom filters, each SSTable check requires a disk seek or, at minimum, a block read, which costs 5 to 10 milliseconds on spinning disks or 100 to 500 microseconds on solid-state drives (SSDs). For keys not present in the database (negative lookups), every SSTable must be checked, incurring multiple expensive I/O operations per query.
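A minimal sketch of the data structure itself may make the "definitely absent / maybe present" contract concrete. The Python below is a simplified, hypothetical filter using SHA-256 with double hashing; production engines use faster non-cryptographic hashes (MurmurHash, xxHash) and cache-friendly blocked layouts, so treat this as an illustration rather than any engine's implementation.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter sketch: a bit array probed at k hash positions."""

    def __init__(self, num_keys: int, false_positive_rate: float = 0.01):
        num_keys = max(1, num_keys)
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, int(-num_keys * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round((self.num_bits / num_keys) * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Double hashing: derive k probe positions from two 64-bit base hashes.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means definitely absent; True means maybe present (false positives possible).
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))
```

Because SSTables are immutable, an engine builds a filter like this once while the file is written and never updates it afterward; it is simply dropped when the SSTable is deleted during compaction.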
Each SSTable stores a Bloom filter for its key set in memory or in a memory-mapped file, sized for approximately a 1% false positive rate (typically 9 to 10 bits per key). During a read, the engine checks the memtable first, then, for each candidate SSTable, checks its Bloom filter. If the filter says the key is definitely absent (returns negative), the engine skips that SSTable's disk access entirely. At a 1% false positive rate, 99% of negative lookups per SSTable avoid disk I/O. In a read path consulting 3 to 5 SSTables, Bloom filters can eliminate roughly 3 to 5 disk seeks per miss, transforming multi-millisecond tail latencies into sub-millisecond responses. This optimization is critical for read-heavy workloads where negative lookups dominate, such as social graph queries checking non-existent relationships or cache miss scenarios.
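To show where the filter sits on the read path, here is a hedged sketch of a point lookup over toy SSTables, reusing the BloomFilter class above. The SSTable class, its read_from_disk method, and point_lookup are illustrative names invented for this sketch, not any engine's actual API.

```python
class SSTable:
    """Toy stand-in for an immutable on-disk sorted file.

    The Bloom filter lives in memory; the key/value data would live on disk
    (modeled here as a dict so the sketch is runnable).
    """

    def __init__(self, entries: dict):
        self.data = entries                      # pretend access requires a block read
        self.bloom = BloomFilter(num_keys=len(entries))
        for key in entries:
            self.bloom.add(key)

    def read_from_disk(self, key: bytes):
        # In a real engine this is the expensive seek / block read we want to avoid.
        return self.data.get(key)


def point_lookup(key: bytes, memtable: dict, sstables: list):
    # 1. Newest data first: check the in-memory memtable.
    if key in memtable:
        return memtable[key]
    # 2. Walk SSTables newest to oldest, consulting each Bloom filter before touching disk.
    for sst in sstables:
        if not sst.bloom.might_contain(key):
            continue                             # definitely absent: skip the I/O entirely
        value = sst.read_from_disk(key)          # maybe present: pay for the read
        if value is not None:
            return value                         # actually present in this SSTable
    return None


# A key absent from every SSTable normally triggers zero disk reads.
tables = [SSTable({b"user:1": b"alice"}), SSTable({b"user:2": b"bob"})]
print(point_lookup(b"user:999", memtable={}, sstables=tables))   # None
```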
The trade-off is memory and CPU cost. A node storing 10 SSTables with 100 million keys each requires approximately 1.2 GB of memory for Bloom filters at a 1% false positive rate. The CPU cost is approximately 7 hash computations per lookup per SSTable checked. Compaction policy directly affects Bloom filter efficiency: size-tiered compaction can create many overlapping SSTables, increasing the number of filters checked per lookup and shifting cost from I/O to CPU. Leveled compaction maintains fewer overlapping files, reducing filter checks but requiring more frequent compaction I/O. Operators must balance compaction write amplification against read-path filter overhead. At very high query rates (hundreds of thousands of queries per second), the CPU cost of checking many filters can become significant, necessitating blocked Bloom filters or caching of filter results.
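The memory figures above follow from the standard sizing formulas m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. A quick check, assuming the 100-million-key SSTables and 1% target described here:

```python
import math

def bloom_sizing(num_keys: int, fp_rate: float):
    """Textbook Bloom filter sizing for a target false positive rate."""
    bits_per_key = -math.log(fp_rate) / (math.log(2) ** 2)   # ~9.6 bits per key at 1%
    num_hashes = round(bits_per_key * math.log(2))           # ~7 hash functions at 1%
    total_bytes = num_keys * bits_per_key / 8
    return bits_per_key, num_hashes, total_bytes

bits_per_key, k, per_sstable = bloom_sizing(100_000_000, 0.01)
print(f"{bits_per_key:.2f} bits/key, {k} hashes")            # 9.59 bits/key, 7 hashes
print(f"one SSTable:  {per_sstable / 1e6:.0f} MB")           # ~120 MB for 100M keys
print(f"ten SSTables: {10 * per_sstable / 1e9:.2f} GB")      # ~1.20 GB per node
```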
💡 Key Takeaways
• LSM engines like Bigtable, HBase, Cassandra, and RocksDB store one Bloom filter per SSTable in memory, sized for approximately a 1% false positive rate (9 to 10 bits per key). For 100 million keys per SSTable, this requires approximately 120 MB of memory per file, scaling to 1.2 GB for 10 SSTables per node.
• Point lookups check the Bloom filter before accessing each SSTable. At a 1% false positive rate, 99% of negative lookups per SSTable avoid disk I/O. In a read path checking 3 to 5 SSTables, filters eliminate 2.97 to 4.95 disk accesses per miss, reducing tail latency from potentially 15 to 25 milliseconds (3 to 5 seeks at 5 ms each) to under 1 millisecond (arithmetic worked through in the sketch after this list).
• Compaction strategy directly impacts Bloom filter overhead. Size-tiered compaction creates many overlapping SSTables, increasing the number of filters checked per query and the CPU cost. Leveled compaction maintains non-overlapping levels, reducing filter checks but increasing compaction write amplification by 10 to 30 times.
• Bloom filters provide negligible benefit for range scans because scans must read multiple sequential keys from disk regardless. Prefix Bloom filters or block-level filters can help range queries by filtering at block granularity, but they add complexity and are workload-dependent.
• CPU cost scales with the number of SSTables checked. At 7 hash operations per filter, checking 5 SSTables costs 35 hash computations per lookup. For workloads exceeding 100,000 queries per second per core, blocked Bloom filters or filter result caching become necessary to prevent CPU from becoming the bottleneck.
• False positive handling must be efficient. When a Bloom filter returns maybe present (1% of negatives), the engine performs a disk read that finds nothing. Systems should track false positive rates per SSTable via sampling to detect filters degraded by capacity overshoot or poor hashing, triggering proactive filter rebuilds during compaction.
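A quick check of the per-miss arithmetic quoted in the takeaways, using the illustrative 5 ms seek time, 1% false positive rate, and 7 hashes per filter from the bullets above:

```python
# Expected disk accesses for one negative lookup, with and without Bloom filters.
seek_ms = 5.0                  # spinning-disk seek time used in the takeaways
fp_rate = 0.01                 # 1% false positive rate
hashes_per_filter = 7

for sstables_checked in (3, 5):
    seeks_without = sstables_checked                    # every SSTable is read from disk
    seeks_with = sstables_checked * fp_rate             # only false positives reach disk
    hash_ops = hashes_per_filter * sstables_checked     # CPU cost of the filter checks
    print(f"{sstables_checked} SSTables: {seeks_without - seeks_with:.2f} seeks avoided, "
          f"{seeks_without * seek_ms:.0f} ms -> {seeks_with * seek_ms:.2f} ms, "
          f"{hash_ops} hash ops")
# 3 SSTables: 2.97 seeks avoided, 15 ms -> 0.15 ms, 21 hash ops
# 5 SSTables: 4.95 seeks avoided, 25 ms -> 0.25 ms, 35 hash ops
```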
📌 Examples
Cassandra allocates Bloom filter memory per SSTable in proportion to key count, targeting a 1% false positive rate. A node with 10 SSTables containing 100 million keys each uses approximately 1.2 GB for Bloom filters. This configuration reduces disk touches for negative lookups from 10 potential seeks (50 to 100 milliseconds total on spinning disks) to approximately 0.1 seeks on average, improving tail latency dramatically.
Google Bigtable pioneered per-SSTable Bloom filters for its tablet servers. With spinning disks (5 to 10 ms seek time), a read path checking 5 SSTables without filters would incur 25 to 50 milliseconds of seek time for a miss. Bloom filters at a 1% false positive rate reduce this to approximately 0.25 to 0.50 milliseconds of wasted seeks plus sub-millisecond filter checks, cutting tail latency by 50 to 100 times.
RocksDB with leveled compaction and Bloom filters at 10 bits per key handles 100,000 point-lookup queries per second with 70% negatives. Filters eliminate approximately 69,300 disk reads per second (99% of 70,000 negatives), while 700 false positives per second still trigger disk reads. The memory cost is approximately 10 bits times 1 billion keys, or 1.25 GB, and the CPU cost is approximately 7 hash operations per filter check times 100,000 queries per second, around 700,000 hash operations per second, easily sustained on modern cores.
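As a rough sanity check, the arithmetic in this RocksDB example can be reproduced directly from its stated workload parameters (all values below come from the example, not from measurement, and the CPU figure assumes roughly one filter consulted per lookup):

```python
# Back-of-the-envelope check of the RocksDB example's numbers.
total_keys = 1_000_000_000       # 1 billion keys across the tree
bits_per_key = 10                # filter sizing from the example (~1% FPR)
qps = 100_000                    # point lookups per second
negative_fraction = 0.70         # 70% of lookups target absent keys
fp_rate = 0.01
hashes_per_check = 7             # assuming ~1 filter consulted per lookup

filter_memory_gb = total_keys * bits_per_key / 8 / 1e9
negatives = qps * negative_fraction
print(f"filter memory:        {filter_memory_gb:.2f} GB")            # 1.25 GB
print(f"disk reads avoided:   {negatives * (1 - fp_rate):,.0f}/s")   # 69,300/s
print(f"false-positive reads: {negatives * fp_rate:,.0f}/s")         # 700/s
print(f"hash operations:      {qps * hashes_per_check:,.0f}/s")      # 700,000/s
```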