Cache-Aside: The Default Pattern for Read-Heavy Systems
The Read Path in Detail
When a request arrives, the application follows a precise sequence. First, check the cache for the requested key. On a cache hit, return immediately with typical latency of 0.5-2ms for distributed caches. On a cache miss, query the database, which takes 5-50ms depending on query complexity. Store the result in cache with a TTL, then return the data. The TTL determines staleness tolerance: set it too short (30 seconds) and you flood the database with cache misses; set it too long (1 hour) and users see stale data after updates. Most systems start with 5-15 minute TTLs. Adding TTL jitter (randomizing expiry by 10-20%) prevents synchronized expiry that causes thundering herd problems.
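The read path above can be sketched as follows. This is a minimal single-process illustration: the in-memory `cache` and `db` dictionaries, the key names, and the TTL constants are stand-ins for a real cache client and database, not a production implementation.

```python
import random
import time

# Illustrative in-memory stand-ins for a distributed cache and a database.
cache = {}                        # key -> (value, expires_at)
db = {"user:42": {"name": "Ada"}}

BASE_TTL = 600                    # 10 minutes, within the typical 5-15 minute range
JITTER_FRACTION = 0.2             # randomize expiry by up to 20%

def jittered_ttl():
    # Spreading expiry times prevents many keys from expiring at once.
    return BASE_TTL * (1 + random.uniform(-JITTER_FRACTION, JITTER_FRACTION))

def get(key):
    entry = cache.get(key)
    now = time.time()
    if entry is not None and entry[1] > now:
        return entry[0]                             # cache hit: return immediately
    value = db.get(key)                             # cache miss: query the database
    if value is not None:
        cache[key] = (value, now + jittered_ttl())  # store with jittered TTL
    return value
```

The first call for a key pays the database round trip and populates the cache; subsequent calls within the TTL are served from the cache alone.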
The Write Path: Delete Not Update
When data changes, write to the database first, then delete the corresponding cache key. This ordering is essential: if you delete the cache first, a concurrent reader might repopulate it with old data before your database write completes, leaving stale data cached until the TTL expires. The pattern is delete-on-write, not update-cache-on-write. Updating the cache directly creates race conditions. Suppose Thread A and Thread B both update the same record: Thread A writes to the database first, Thread B second with newer data; but due to network timing, Thread B updates the cache first, and Thread A then overwrites the cache with older data. Now cache and database are inconsistent until the TTL expires. Deletion avoids this because the next reader fetches fresh data from the database.
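The write path is short enough to state directly. As above, `cache` and `db` are illustrative in-memory stand-ins, not a real client API:

```python
# Illustrative in-memory stand-ins; cache and database start in sync.
db = {"user:42": {"name": "Ada"}}
cache = {"user:42": {"name": "Ada"}}

def update(key, value):
    db[key] = value          # 1. write to the database first
    cache.pop(key, None)     # 2. then delete the cache key; never update it in place

update("user:42", {"name": "Grace"})
```

Because the cache entry is gone after the write, the next read falls through to the database and repopulates the cache with the fresh value, sidestepping the write-write race described above.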
Production Implementation Patterns
Large-scale deployments use lease tokens to prevent thundering herds. When a key is missing, the first requester acquires a lease (a short-lived lock); other requesters wait for the lease holder to populate the cache rather than all querying the database simultaneously. A two-tier architecture, pairing a per-host L1 cache with a distributed L2 cache, reduces network hops for hot data. Multiget operations batch multiple key fetches into a single round trip, reducing network overhead when fetching related data. These optimizations enable systems to handle billions of cache operations per second with sub-millisecond p95 latencies and hit ratios exceeding 90%.
When Cache Aside Excels
Cache-aside is ideal when read-to-write ratios are high (10:1 or greater), you need fine-grained control over what gets cached, your cache infrastructure is separate from your database (allowing independent scaling), and you can tolerate eventual consistency, where cached data may be stale until the TTL expires. The pattern works exceptionally well with denormalized key designs where you cache precomputed views.