Size- and Cost-Aware Eviction: Optimizing for Variable Objects and Expensive Misses
Why Size and Cost Matter
Most eviction policies treat all items equally and optimize for request hit ratio. This is misleading when objects vary in size or miss cost. A cache that keeps one 10 MB object while evicting a hundred 10 KB objects shows a low request hit ratio but an excellent byte hit ratio. Conversely, caching many small, cheap objects at the expense of large, expensive ones maximizes request hits but drives up backend cost.
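The divergence in the scenario above is easy to quantify. A minimal sketch, assuming a hypothetical access pattern in which each of the 101 objects is requested once:

```python
# Cache holds one 10 MB object; the 100 evicted 10 KB objects all miss.
# Assumed access pattern: 101 requests, one per object.
hits = 1  # only the 10 MB object is a hit
request_hit_ratio = hits / 101
byte_hit_ratio = 10_000_000 / (10_000_000 + 100 * 10_000)

print(f"request hit ratio: {request_hit_ratio:.3f}")  # ~0.010
print(f"byte hit ratio:    {byte_hit_ratio:.3f}")     # ~0.909
```

The two metrics tell opposite stories about the same cache, which is exactly why neither alone is sufficient.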
Size- and Cost-Aware Scoring
GDSF (GreedyDual-Size with Frequency) assigns each object a priority of L + (frequency × cost) / size, where L is an aging term set to the priority of the most recently evicted object. The lowest-priority object is evicted first, maximizing utility per byte. A 1 MB object accessed 10 times with a 100 ms fetch cost therefore scores differently than a 10 KB object accessed 100 times with a 10 ms cost.
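The policy fits naturally onto a min-heap with lazy deletion. A minimal sketch (class and method names are illustrative, not from any particular library):

```python
import heapq

class GDSFCache:
    """Sketch of GDSF eviction: priority = L + freq * cost / size."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.L = 0.0       # aging term: priority of the last evicted object
        self.entries = {}  # key -> (priority, freq, size, cost, value)
        self.heap = []     # (priority, key); stale entries removed lazily

    def _priority(self, freq, cost, size):
        return self.L + freq * cost / size

    def get(self, key):
        e = self.entries.get(key)
        if e is None:
            return None
        prio, freq, size, cost, value = e
        freq += 1  # each hit bumps frequency, raising the priority
        prio = self._priority(freq, cost, size)
        self.entries[key] = (prio, freq, size, cost, value)
        heapq.heappush(self.heap, (prio, key))  # old heap entry goes stale
        return value

    def put(self, key, value, size, cost):
        while self.used + size > self.capacity and self.heap:
            prio, victim = heapq.heappop(self.heap)
            e = self.entries.get(victim)
            if e is None or e[0] != prio:
                continue  # stale heap entry; skip
            self.L = prio  # inflate baseline so new objects can compete
            self.used -= e[2]
            del self.entries[victim]
        prio = self._priority(1, cost, size)
        self.entries[key] = (prio, 1, size, cost, value)
        heapq.heappush(self.heap, (prio, key))
        self.used += size
```

The aging term L is what keeps once-popular objects from lingering forever: every eviction raises the floor that new insertions start from.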
Implementation Considerations
Without bounds, large objects can either be starved (never admitted) or monopolize the cache. Mitigations include size-class partitions, cost normalization, or a dedicated large-object region. Also account for true memory overhead, not just value bytes: 10 M entries with 200-byte values, 50-byte keys, and 32 bytes of per-entry metadata consume 2.82 GB.
When to Use Size-Aware Eviction
Size-aware eviction is essential when object sizes span orders of magnitude: CDN caches (video segments vs. API responses), ML feature stores (scalars vs. embeddings). A feature store that caches 1 KB scalar features alongside 10 MB embeddings under LRU lets a single embedding admission evict thousands of scalars. In such workloads GDSF can reduce backend load by around 40%. Track both request and byte hit ratios, and alert when they diverge.
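The monitoring advice above can be sketched as a small helper that tracks both ratios side by side; the class and threshold are illustrative assumptions, not part of any standard API:

```python
class HitRatioTracker:
    """Tracks request and byte hit ratios together so divergence is visible."""

    def __init__(self):
        self.hits = self.requests = 0
        self.byte_hits = self.bytes_requested = 0

    def record(self, size_bytes, hit):
        self.requests += 1
        self.bytes_requested += size_bytes
        if hit:
            self.hits += 1
            self.byte_hits += size_bytes

    def request_hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0

    def byte_hit_ratio(self):
        return self.byte_hits / self.bytes_requested if self.bytes_requested else 0.0

    def divergence(self):
        return abs(self.request_hit_ratio() - self.byte_hit_ratio())

# Example: 100 small (10 KB) hits and one large (10 MB) miss.
t = HitRatioTracker()
t.record(10_000_000, hit=False)
for _ in range(100):
    t.record(10_000, hit=True)
if t.divergence() > 0.3:  # alert threshold is an assumption; tune per workload
    print("request/byte hit ratios diverging; inspect size distribution")
```

A large divergence is the signal that the cache is optimizing one metric at the expense of the other and that a size-aware policy is worth evaluating.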