What is Cache Invalidation and Why Does It Matter?
The Freshness vs. Performance Trade-off
Caching speeds up reads by avoiding origin hits, but it creates stale copies. The core challenge is deciding how and when to mark entries invalid or refresh them: fast responses favor keeping data cached longer, while current data favors frequent invalidation. Production systems handle over a billion lookups per second at sub-millisecond latency, with global purges completing in 150 ms to 1 s.
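The stale-copy problem comes down to a single check at read time. Below is a minimal sketch of TTL-based expiry using an in-process dict; the names (`ttl_cache`, `put`, `get_fresh`) are illustrative, not from any particular library.

```python
import time

# key -> (value, expires_at); a stand-in for a real cache store
ttl_cache = {}

def put(key, value, ttl_seconds):
    # Record when this entry stops being trustworthy.
    ttl_cache[key] = (value, time.monotonic() + ttl_seconds)

def get_fresh(key):
    entry = ttl_cache.get(key)
    if entry is None:
        return None              # miss: never cached
    value, expires_at = entry
    if time.monotonic() >= expires_at:
        del ttl_cache[key]       # stale: treat as a miss, forcing a refresh
        return None
    return value                 # fresh hit
```

A longer TTL means more hits on this fast path; a shorter one means fewer stale reads. That is the trade-off in one branch.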
Three Decision Axes
Timing: invalidate based on time (TTL, where a Time To Live defines when data expires), when data changes (event-driven), or by checking freshness on every access.
Data flow: update the cache synchronously with writes (write-through), asynchronously after writes (write-behind), or only on reads after invalidation (cache-aside).
Granularity: invalidate individual keys, groups of keys using tags, or bump a generation counter that invalidates everything in a namespace.
Hybrid Approaches in Production
Most systems combine these: cache-aside reads with a TTL as a safety net (5-60 minutes for static content, 1-10 seconds for dynamic data), explicit event-driven invalidation for correctness-critical keys (permissions, inventory), and generational namespaces to avoid massive fan-out when one change affects many entries.
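The generational-namespace piece is worth spelling out, since it sidesteps fan-out entirely: instead of deleting thousands of keys, one counter bump makes them all unreachable. A minimal sketch, with illustrative names (`namespace_gen`, `versioned_key`):

```python
namespace_gen = {}   # namespace -> current generation number
store = {}           # versioned key -> value; stand-in for the cache

def versioned_key(namespace, key):
    # The generation number is part of every key in the namespace.
    gen = namespace_gen.setdefault(namespace, 0)
    return f"{namespace}:v{gen}:{key}"

def cache_set(namespace, key, value):
    store[versioned_key(namespace, key)] = value

def cache_get(namespace, key):
    return store.get(versioned_key(namespace, key))

def invalidate_namespace(namespace):
    # One counter bump invalidates everything in the namespace:
    # old versioned keys are simply never read again.
    namespace_gen[namespace] = namespace_gen.get(namespace, 0) + 1
```

The orphaned entries under old generations are never deleted explicitly; in a real cache they would age out via TTL or eviction, which is why this pattern pairs naturally with the safety-net TTL above.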