
Eviction Policy Fundamentals: Why Pure LRU and LFU Are Rarely Enough

Eviction policies determine which cached entries to remove when a cache reaches its capacity limit, directly impacting hit ratio and backend load. The goal is to maximize utility by retaining the entries that provide the most value, whether measured by hit ratio, cost savings, or latency reduction. An eviction decision must balance multiple factors: access recency (temporal locality), access frequency (long-term popularity), object size in bytes, and miss penalty (the cost to refetch from the backend).

Production systems almost never use pure Least Recently Used (LRU) or Least Frequently Used (LFU) alone, because real-world workloads contain complex patterns that simple policies handle poorly. Effective production caching architectures instead combine three layers of policies. First, an admission policy decides whether to admit a new item at all, filtering out one-hit wonders that would waste cache space. Second, an eviction policy chooses which resident item to remove when space is needed. Third, expiration and Time To Live (TTL) mechanisms handle correctness by invalidating stale data regardless of capacity pressure. For example, a system might use TinyLFU admission to filter scan traffic, segmented LRU eviction to balance recency with protection from pollution, and TTL-based expiration to maintain freshness guarantees. This layered approach provides robustness across diverse workload characteristics.

The constraints of production systems drive practical design choices. Eviction operations must run in O(1) or amortized O(1) time to avoid latency spikes at millions of queries per second. Metadata overhead must stay minimal, typically tens of bytes per entry, because a cache with 10 million entries and 50 bytes of metadata per entry consumes 500 MB just for bookkeeping. Lock contention must be minimized through sharding or lock-free techniques. The policy must exhibit predictable behavior under bursty access patterns, highly skewed Zipfian distributions, and scan-heavy workloads that read large keysets sequentially. These requirements explain why approximate algorithms and hybrid designs dominate in practice.
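To make the layering concrete, here is a minimal Python sketch of the admission-plus-segmented-LRU-plus-TTL combination described above. It is illustrative only: the `LayeredCache` name and its parameters are invented for this example, and a plain counter dictionary stands in for TinyLFU's bounded, periodically decayed count-min sketch.

```python
import time
from collections import OrderedDict

class LayeredCache:
    """Illustrative three-layer cache: counter-based admission (standing in
    for a TinyLFU sketch), segmented LRU eviction, and TTL expiration."""

    def __init__(self, capacity, protected_ratio=0.8, default_ttl=300.0):
        self.capacity = capacity
        self.protected_cap = max(1, int(capacity * protected_ratio))
        self.default_ttl = default_ttl
        self.probation = OrderedDict()   # newly admitted entries, in LRU order
        self.protected = OrderedDict()   # entries that have been hit again
        self.freq = {}   # unbounded here; real sketches are fixed-size and decayed

    def get(self, key):
        self.freq[key] = self.freq.get(key, 0) + 1          # record popularity
        for segment in (self.protected, self.probation):
            if key in segment:
                value, expires_at = segment.pop(key)
                if time.monotonic() > expires_at:           # TTL: correctness, not capacity
                    return None
                self.protected[key] = (value, expires_at)   # promote on hit
                while len(self.protected) > self.protected_cap:
                    k, v = self.protected.popitem(last=False)
                    self.probation[k] = v                   # demote protected LRU
                return value
        return None

    def put(self, key, value, ttl=None):
        if len(self.probation) + len(self.protected) >= self.capacity and self.probation:
            victim = next(iter(self.probation))             # probation LRU = eviction candidate
            if self.freq.get(key, 0) < self.freq.get(victim, 0):
                return                                      # admission denied: likely one-hit wonder
        self.protected.pop(key, None)                       # avoid duplicates on update
        self.probation.pop(key, None)
        self.probation[key] = (value, time.monotonic() + (ttl or self.default_ttl))
        while len(self.probation) + len(self.protected) > self.capacity:
            self.probation.popitem(last=False)              # evict probation LRU
```

The point to notice is that each layer answers a different question: the frequency counter gates admission, the two recency-ordered segments decide eviction, and the per-entry expiry enforces freshness independently of capacity pressure.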
💡 Key Takeaways
Eviction policies must consider four factors: access recency, access frequency, object size in bytes, and miss penalty. Optimizing for recency alone makes the cache vulnerable to scans; optimizing for frequency alone causes slow adaptation to workload shifts.
Production systems layer three mechanisms: admission policy (should we cache this), eviction policy (what to remove), and TTL expiration (when does data become invalid). Meta Memcache combines LRU eviction per slab class with an LRU crawler for expiration and slab rebalancing to prevent fragmentation.
Pure LRU adapts quickly to changing hot sets, but sequential scans evict the entire working set, collapsing the hit rate to near zero. Pure LFU retains long-term popular items, but without counter decay, stale once-hot entries persist indefinitely and prevent adaptation.
Operations must be O(1) to handle millions of requests per second. Metadata overhead must be minimal: with 10 million entries at 50 bytes of overhead each, metadata alone consumes 500 MB. Lock contention must be managed through sharding by key hash; a short sketch of a sharded O(1) LRU follows this list.
At 100,000 queries per second (QPS) with a 10 millisecond miss penalty, reducing the miss rate from 20% to 16% saves 4,000 backend requests per second and approximately 40 seconds of backend CPU per wall-clock second across a fleet (the arithmetic is worked out after this list).
Real workloads exhibit power-law distributions, temporal bursts, and scan patterns. A robust policy must handle all three without catastrophic degradation. Approximate algorithms trade a small accuracy loss for large performance gains and better scalability under concurrency.
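The O(1) and lock-contention constraints above are easy to see in code. The sketch below (class and parameter names are illustrative, not from any particular library) pairs a hash map with a recency-ordered structure, Python's OrderedDict, for constant-time lookup, promotion, and eviction, and shards by key hash so concurrent threads rarely contend on the same lock.

```python
import threading
from collections import OrderedDict

class ShardedLRU:
    """O(1) LRU sharded by key hash so threads on different shards never contend."""

    def __init__(self, capacity, shards=16):
        self.shards = [(threading.Lock(), OrderedDict()) for _ in range(shards)]
        self.per_shard_cap = max(1, capacity // shards)

    def _shard(self, key):
        return self.shards[hash(key) % len(self.shards)]

    def get(self, key):
        lock, entries = self._shard(key)
        with lock:                            # only this shard is locked
            if key not in entries:
                return None
            entries.move_to_end(key)          # O(1) recency update
            return entries[key]

    def put(self, key, value):
        lock, entries = self._shard(key)
        with lock:
            entries[key] = value
            entries.move_to_end(key)          # treat updates as most recent
            if len(entries) > self.per_shard_cap:
                entries.popitem(last=False)   # O(1) eviction of the shard's LRU
```

The miss-rate savings arithmetic from the QPS takeaway above works out as straight multiplication, using only the numbers already stated:

```python
qps = 100_000                              # total requests per second
miss_penalty_s = 0.010                     # 10 ms of backend work per miss
saved_misses = qps * (0.20 - 0.16)         # 4,000 fewer backend requests/s
saved_cpu = saved_misses * miss_penalty_s  # 40 s of backend CPU per wall-clock second
print(saved_misses, saved_cpu)             # 4000.0 40.0
```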
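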
📌 Examples
Amazon DynamoDB Accelerator (DAX) uses TTL plus LRU eviction to provide microsecond read latencies at millions of requests per second per cluster. The TTL maintains correctness with eventually consistent reads while LRU handles memory pressure. Lowering miss ratio by a few percentage points at million RPS scale reduces tens of thousands of database round trips per second.
Meta (Facebook) Memcache serves billions of requests per day, with peak rates in the tens of millions of operations per second and sub-millisecond median GET latencies. Eviction is LRU-like per slab class (fixed-size allocation buckets), with an LRU crawler cleaning expired items and slab rebalancing preventing memory fragmentation from stranding space in underutilized size classes.