Caching • Eviction Policies • Hard • ⏱️ ~3 min
Size- and Cost-Aware Eviction: Optimizing for Variable-Sized Objects and Expensive Misses
Most eviction policies treat all cached items as equal regardless of size or fetch cost, optimizing for request hit ratio (the fraction of requests served from cache). However, in systems with variable-sized objects or highly variable miss penalties, request hit ratio can be a misleading metric. A cache that retains one 10 MB object while evicting 100 small 10 KB objects may achieve a low request hit ratio but an excellent byte hit ratio. Conversely, caching many small, cheap-to-fetch objects at the expense of a few large, expensive ones optimizes request hits but increases backend cost and latency. Size- and cost-aware eviction policies optimize byte hit ratio or cost savings by incorporating object size and fetch cost into eviction decisions.
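The divergence between the two metrics can be made concrete with a toy workload built from the numbers above (the workload itself is hypothetical):

```python
# Toy workload mirroring the example above (hypothetical numbers):
# one cached 10 MB object is hit, while 100 small 10 KB objects miss.
requests = [("big", 10 * 1024 * 1024, True)]                     # (id, size_bytes, was_hit)
requests += [(f"small-{i}", 10 * 1024, False) for i in range(100)]

hits = [r for r in requests if r[2]]
request_hit_ratio = len(hits) / len(requests)                    # hits / total requests
byte_hit_ratio = sum(r[1] for r in hits) / sum(r[1] for r in requests)

print(f"request hit ratio: {request_hit_ratio:.3f}")             # 0.010
print(f"byte hit ratio:    {byte_hit_ratio:.3f}")                # 0.911
```

One hit out of 101 requests is a dismal request hit ratio, yet over 91% of all requested bytes were served from cache, so origin bandwidth is almost entirely avoided.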
GreedyDual-Size-Frequency (GDSF) is a representative size- and cost-aware policy commonly used in Content Delivery Networks (CDNs) and in caches of machine learning features or media objects. Each cached item is assigned a priority score calculated as a global aging value plus (access frequency times fetch cost) divided by size in bytes; the aging value is raised to the priority of each evicted item, so long-idle entries gradually become evictable. The item with the lowest score is evicted first. This scoring function maximizes the utility (hits times cost saved) per byte of cache space consumed. For example, a 1 MB object accessed 10 times with a 100 millisecond fetch cost has a different eviction priority than a 10 KB object accessed 100 times with a 10 millisecond fetch cost, and the policy balances these trade-offs explicitly. In practice, fetch cost can be estimated from backend latency percentiles, database query cost, or explicit cost annotations.
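A minimal sketch of this scoring scheme, assuming a min-heap with lazy deletion of stale records (the class name and structure are illustrative, not any production implementation):

```python
import heapq

class GDSFCache:
    """Minimal GreedyDual-Size-Frequency sketch (illustrative only).
    Priority H = L + freq * cost / size, where L is a global aging
    value raised on each eviction."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.clock = 0.0     # aging value L
        self.entries = {}    # key -> (priority, size, freq, cost)
        self.heap = []       # (priority, key) min-heap; may hold stale records

    def put(self, key, size, cost):
        if size > self.capacity:
            return           # never admit an object larger than the whole cache
        freq = 1
        if key in self.entries:
            old = self.entries.pop(key)
            self.used -= old[1]
            freq = old[2] + 1
        # Evict lowest-priority items until the new object fits.
        while self.used + size > self.capacity and self.heap:
            prio, victim = heapq.heappop(self.heap)
            entry = self.entries.get(victim)
            if entry is None or entry[0] != prio:
                continue     # stale heap record; skip it
            self.clock = prio            # age the cache up to the evicted priority
            self.used -= entry[1]
            del self.entries[victim]
        prio = self.clock + freq * cost / size
        self.entries[key] = (prio, size, freq, cost)
        self.used += size
        heapq.heappush(self.heap, (prio, key))
```

With cost in milliseconds and size in bytes, the two objects from the text score roughly 10 × 100 / 1,048,576 ≈ 0.001 (1 MB, 10 hits) versus 100 × 10 / 10,240 ≈ 0.098 (10 KB, 100 hits), so the small hot object is retained far longer per byte; the units are arbitrary as long as they are consistent across entries.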
Implementing size-aware policies requires careful normalization and fairness considerations. Without bounds, very large objects can be starved and never admitted, or conversely, one giant hot object can monopolize the cache. Practical systems often use hybrid designs: a small large-object partition with separate eviction, cost normalization (for example, capping cost ratios), or size-class-based policies. Another approach is to track both request hit ratio and byte hit ratio and alarm on divergence. Additionally, size-aware metadata must account for the true memory footprint including overhead: a 10-million-entry cache with average 200 byte values, 50 byte keys, and 32 bytes of metadata consumes approximately 2.82 GB, and eviction decisions should consider total bytes, not just entry count. Size-aware policies are essential for CDNs, object storage caches, ML feature stores, and any system where object sizes span multiple orders of magnitude.
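The footprint arithmetic above is simple multiplication, sketched here for the stated entry sizes:

```python
# Back-of-envelope footprint for the cache described above:
# 10 million entries, 200 B values, 50 B keys, 32 B of per-entry metadata.
entries = 10_000_000
bytes_per_entry = 200 + 50 + 32       # value + key + eviction metadata
total_bytes = entries * bytes_per_entry

print(f"{total_bytes / 1e9:.2f} GB")  # 2.82 GB
```

Note that real caches add further overhead (hash-table buckets, allocator slack, pointer alignment), so this figure is a lower bound.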
💡 Key Takeaways
•Size-aware policies optimize byte hit ratio or cost savings per byte rather than request hit ratio. A cache that keeps one 10 MB object while evicting 100 10 KB objects has a low request hit ratio but potentially an excellent byte hit ratio.
•GreedyDual-Size-Frequency (GDSF) assigns priority as an aging value plus (access frequency times fetch cost) divided by object size. Evicting the lowest-priority item maximizes utility (cost saved) per byte of cache space, balancing size, popularity, and expense.
•Fetch cost can be measured as backend latency (p50, p99), database query cost, or explicit annotations. For example, a 1 MB object with a 100 millisecond fetch cost and a 10 KB object with a 10 millisecond cost are weighted differently in eviction.
•Naive size-aware policies can starve large objects (never admitted) or allow one giant hot object to monopolize the cache. Mitigations include size-class partitions, cost normalization (capping ratios), or carving out a small large-object region with a separate policy.
•Metadata accounting must include the true memory footprint. With 10 million entries, 200 byte average values, 50 byte keys, and 32 bytes of per-entry metadata, total memory is approximately 2.82 GB. Eviction should target total bytes, not just entry count.
•Size-aware eviction is critical for CDNs (variable media object sizes), ML feature stores (features range from bytes to megabytes), and object storage caches. Systems like Varnish and Nginx use size-aware scoring for cache management.
📌 Examples
Varnish (a widely used HTTP cache and CDN component) uses a size-aware eviction algorithm that considers object size and access patterns. Varnish tracks byte hit ratio separately from request hit ratio and can be configured to optimize for byte efficiency, which is critical for reducing origin bandwidth costs in CDN deployments.
An ML feature-serving cache stores features ranging from 1 KB (scalar features) to 10 MB (embedding vectors). Using plain LRU causes large embeddings to evict many small features, increasing backend queries. Switching to GDSF with cost proportional to fetch latency reduces backend load by 40% by retaining expensive large features and evicting cheap small ones.