Bloom Filters in LSM Storage Engines: Read Path Optimization
The LSM Read Path Problem
Bloom filters are a foundational component of Log-Structured Merge (LSM) tree storage engines. LSM trees organize data into immutable Sorted String Table (SSTable) files on disk, with writes buffered in an in-memory memtable and periodic compaction merging files. A point lookup may need to check multiple SSTables in sequence until the key is found or all candidates are exhausted. Without Bloom filters, each SSTable check requires a disk seek or block read, costing 5 to 10 ms on spinning disks or 100 to 500 microseconds on SSDs. For keys not present in the database (negative lookups), every SSTable must be checked, incurring multiple expensive I/O operations per query.
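The cost of negative lookups can be made concrete with a small sketch of a hypothetical engine's read path without filters; `read_block` here simulates one disk read and the names are illustrative, not any real engine's API:

```python
# Sketch: without Bloom filters, a point lookup must probe every SSTable
# candidate, newest first, until a hit or exhaustion.

def lookup_without_filters(key, memtable, sstables, read_block):
    """memtable: dict; sstables: newest-first list of on-'disk' tables;
    read_block(sst, key): simulates one disk read, returns value or None."""
    if key in memtable:
        return memtable[key]
    for sst in sstables:           # each iteration pays one seek/block read
        value = read_block(sst, key)
        if value is not None:
            return value
    return None                    # negative lookup: paid I/O for every SSTable

# Count the simulated reads for a miss against 5 SSTables.
reads = 0
def read_block(sst, key):
    global reads
    reads += 1
    return sst.get(key)

sstables = [{"a": 1}, {"b": 2}, {"c": 3}, {"d": 4}, {"e": 5}]
lookup_without_filters("zzz", {}, sstables, read_block)
# reads is now 5: one I/O per SSTable, all wasted on an absent key
```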
Per SSTable Filtering
Each SSTable stores a Bloom filter over its key set, held in memory or in a memory-mapped file and sized for approximately a 1% false positive rate (typically 9 to 10 bits per key). During a read, the engine checks the memtable first; then, for each candidate SSTable, it consults that table's Bloom filter. If the filter returns negative, the key is definitely absent and the engine skips that SSTable's disk access entirely. At a 1% false positive rate, 99% of negative lookups against each SSTable avoid disk I/O. In a read path consulting 3 to 5 SSTables, Bloom filters can eliminate 2 to 5 disk seeks per miss, turning multi-millisecond tail latencies into sub-millisecond responses.
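A minimal Bloom filter sketch makes the mechanics concrete. This is a simplified illustration (real engines use faster non-cryptographic hashes); with roughly 10 bits per key and 7 probe positions derived by double hashing, the false positive rate lands near 1%:

```python
import hashlib

class BloomFilter:
    """Minimal sketch: ~10 bits/key and k=7 hashes give roughly a 1% FPR.
    A negative answer is definitive; a positive answer is only probable."""

    def __init__(self, num_keys, bits_per_key=10, num_hashes=7):
        self.m = max(8, num_keys * bits_per_key)   # total bits
        self.k = num_hashes
        self.bits = bytearray(self.m // 8 + 1)

    def _positions(self, key):
        # Double hashing: derive k bit positions from two digest halves.
        d = hashlib.sha256(key.encode()).digest()
        h1 = int.from_bytes(d[:8], "little")
        h2 = int.from_bytes(d[8:16], "little") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False => definitely absent (skip the SSTable); True => must read disk.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))
```

On the read path, the engine would call `might_contain` before touching the SSTable and skip the disk access whenever it returns False; keys that were added are never reported absent.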
Memory and CPU Trade-offs
The trade-off is memory and CPU cost. A node storing 10 SSTables with 100 million keys each requires approximately 1.2 GB of memory for Bloom filters at a 1% false positive rate, and each lookup costs roughly 7 hash computations per SSTable checked. Compaction policy directly affects Bloom filter efficiency: size-tiered compaction can create many overlapping SSTables, increasing the number of filters checked per lookup and shifting cost from I/O to CPU, while leveled compaction maintains fewer overlapping files, reducing filter checks at the price of more frequent compaction I/O.
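The 1.2 GB figure follows from the standard Bloom filter sizing formulas, which a short calculation can verify:

```python
import math

def bloom_sizing(num_keys, fp_rate):
    """Standard sizing formulas:
        bits    m = -n * ln(p) / (ln 2)^2
        hashes  k = (m / n) * ln 2
    """
    m = math.ceil(-num_keys * math.log(fp_rate) / (math.log(2) ** 2))
    k = round((m / num_keys) * math.log(2))
    return m, k

# One SSTable: 100 million keys at 1% false positive rate.
bits_per_filter, hashes = bloom_sizing(100_000_000, 0.01)
bits_per_key = bits_per_filter / 100_000_000      # ~9.59 bits/key
total_gb = 10 * bits_per_filter / 8 / 1e9         # 10 SSTables: ~1.2 GB
# hashes == 7, matching the ~7 hash computations per filter check
```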
Operational Considerations
Operators must balance compaction write amplification against read-path filter overhead. At very high query rates (hundreds of thousands of QPS), the CPU cost of checking many filters can become significant, motivating blocked Bloom filters or caching of filter results. Bloom filters provide negligible benefit for range scans, because a scan must read runs of sequential keys from disk regardless of membership. Prefix Bloom filters or block-level filters can help range queries by filtering at prefix or block granularity, but they add complexity.
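The blocked variant mentioned above can be sketched as follows. This is an illustrative implementation, not any engine's actual code: a key first hashes to one 64-byte block, and all k probe bits fall inside that block, so a membership check touches a single cache line instead of k scattered ones, trading a slightly higher false positive rate for lower CPU cost:

```python
import hashlib

BLOCK_BITS = 512  # one 64-byte cache line per block

class BlockedBloomFilter:
    """Sketch of a blocked (cache-line) Bloom filter: all probes for a
    given key land in one block, so a lookup reads one cache line."""

    def __init__(self, num_blocks, num_hashes=7):
        self.num_blocks = num_blocks
        self.k = num_hashes
        self.bits = bytearray(num_blocks * BLOCK_BITS // 8)

    def _probes(self, key):
        d = hashlib.sha256(key.encode()).digest()
        block = int.from_bytes(d[:8], "little") % self.num_blocks
        h1 = int.from_bytes(d[8:16], "little")
        h2 = int.from_bytes(d[16:24], "little") | 1
        base = block * BLOCK_BITS                 # all k bits stay in-block
        return [base + (h1 + i * h2) % BLOCK_BITS for i in range(self.k)]

    def add(self, key):
        for p in self._probes(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._probes(key))
```

Because uneven hashing can overload individual blocks, blocked filters need a few extra bits per key to match the false positive rate of a classic filter; the win is that the hot path becomes one memory access plus cheap in-register bit tests.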