How Do CDN Cache Keys, Time to Live (TTL), and Invalidation Strategies Work at Scale?
Cache Key Design
Cache keys determine which requests can share cached responses. A cache key is derived from components of the request: URL path, selected query parameters, and specific headers. Two requests with identical cache keys receive the same cached response; different keys trigger separate origin fetches. Proper cache key design is critical for both performance (maximizing cache hit ratio) and security (preventing cache poisoning where attackers inject malicious cached content).
Keys that include too much cause cache fragmentation: every unnecessary parameter splits otherwise identical requests into separate cache entries. For example, including a random session tracking parameter makes every request unique, effectively disabling caching. Keys that include too little risk cache poisoning: if an untrusted header like X-Forwarded-Host influences the origin response but is not part of the cache key, an attacker can send a crafted value and have the resulting malicious response stored and served to every user who shares that key. Production systems strictly define which components contribute to cache keys, normalize values (lowercase hostnames, sort query parameters), and strip or validate untrusted headers before they reach the origin.
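The normalization rules above can be sketched in a few lines of Python. This is a minimal illustration, not any particular CDN's implementation: the allowlists KEYED_PARAMS, KEYED_HEADERS, and UNTRUSTED_HEADERS and the function names are hypothetical, and a real deployment would choose them per route based on what the origin actually varies on.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical allowlists; real values depend on what the origin varies on.
KEYED_PARAMS = {"page", "sort", "category"}
KEYED_HEADERS = {"accept-encoding"}
UNTRUSTED_HEADERS = {"x-forwarded-host"}


def cache_key(url, headers):
    """Build a normalized cache key from allowlisted request components."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Drop parameters that are not allowlisted, then sort so order is irrelevant.
    params = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k in KEYED_PARAMS
    )
    # Header names are case-insensitive, so compare them in lowercase.
    keyed = sorted(
        (name.lower(), value.strip())
        for name, value in headers.items()
        if name.lower() in KEYED_HEADERS
    )
    return "|".join([host, parts.path or "/", urlencode(params), urlencode(keyed)])


def forwardable_headers(headers):
    """Strip untrusted headers before the request is sent to origin."""
    return {n: v for n, v in headers.items() if n.lower() not in UNTRUSTED_HEADERS}
```

With these assumptions, two requests that differ only in parameter order, hostname case, or a session parameter map to the same key, and X-Forwarded-Host never reaches the origin.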
TTL Management
TTL (Time To Live) specifies how long a cached response remains valid before the CDN must revalidate or refetch from origin. Common production TTL patterns include 24 hours for static assets like JavaScript bundles and CSS (files that rarely change), 1 to 5 minutes for semi-dynamic content like product listings (balancing freshness against cache hits), and a TTL of 0 or a very short TTL for user-specific data that should not be served from a shared cache.
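As a rough illustration, the TTL tiers above could be expressed as Cache-Control header values chosen per response. The path rules (/products, the .js and .css suffixes), the function name, and the conservative default are assumptions made for this sketch; only the TTL values come from the text.

```python
STATIC_TTL = 24 * 60 * 60   # 24 hours for static assets
LISTING_TTL = 5 * 60        # upper end of the 1 to 5 minute range


def cache_control_for(path, user_specific=False):
    """Pick a Cache-Control value for a response (hypothetical routing rules)."""
    if user_specific:
        # Per-user data must not be stored by a shared cache such as a CDN.
        return "private, no-store"
    if path.endswith((".js", ".css")):
        return f"public, max-age={STATIC_TTL}"
    if path.startswith("/products"):
        return f"public, max-age={LISTING_TTL}"
    # Assumed default: cacheable, but always revalidated before reuse.
    return "no-cache"
```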
When the TTL expires, the CDN can revalidate with origin using conditional requests. The If-Modified-Since header asks the origin whether the content has changed since a given timestamp; If-None-Match does the same using an ETag instead of a timestamp. If the content is unchanged, origin returns 304 Not Modified (a small response with no body) and the CDN extends the TTL without transferring the full content. This reduces bandwidth for content that expires often but changes rarely.
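The revalidation exchange can be sketched with a toy origin and a cache entry stored as a plain dict. The function names, the entry layout, and the use of integer seconds for time are assumptions made for illustration; the sketch only shows the If-Modified-Since flow described above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def origin_respond(last_modified, body, request_headers):
    """Toy origin: return (status, headers, body), honoring If-Modified-Since."""
    ims = request_headers.get("If-Modified-Since")
    if ims is not None and last_modified <= parsedate_to_datetime(ims):
        return 304, {}, b""  # unchanged: no body is transferred
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    return 200, headers, body


def revalidate(entry, origin, now, ttl):
    """Revalidate an expired cache entry; `origin` is a callable taking headers."""
    status, headers, body = origin({"If-Modified-Since": entry["last_modified"]})
    if status == 304:
        # Keep the stored body and simply extend its freshness lifetime.
        entry["expires_at"] = now + ttl
    else:
        entry.update(
            body=body,
            last_modified=headers["Last-Modified"],
            expires_at=now + ttl,
        )
    return entry
```

Note that HTTP dates have one-second resolution, so the toy origin expects last_modified to be a UTC datetime without microseconds.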
Stale Content Strategies
The stale-while-revalidate pattern serves slightly stale content to users while fetching fresh content from origin in the background. This keeps tail latencies low during origin slowness: the user sees cached content immediately while the CDN asynchronously updates its cache. stale-if-error permits serving cached content when origin returns errors (5xx responses or timeouts), improving availability during origin incidents.
Both patterns accept bounded staleness in exchange for better performance and resilience. The staleness window is typically short (seconds to minutes) and configurable. For content where any staleness is unacceptable (financial data, security-critical responses), these patterns should not be used.
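One way to picture the two staleness windows is as a decision made from the age of a cached response. The sketch below assumes a response sent with a header such as Cache-Control: max-age=60, stale-while-revalidate=30, stale-if-error=300; the function name and the returned labels are hypothetical, and real CDNs differ in how they combine these windows.

```python
def serving_decision(age, max_age, swr, sie, origin_healthy=True):
    """Decide how to answer a request for a cached response of a given age.

    All arguments are in seconds; `swr` and `sie` are the stale-while-revalidate
    and stale-if-error windows, both measured from the end of `max_age`.
    """
    if age <= max_age:
        return "serve-fresh"
    if age <= max_age + swr:
        # Answer immediately from cache; refresh in the background.
        return "serve-stale-and-revalidate"
    if not origin_healthy and age <= max_age + sie:
        # Origin is failing, so a bounded-stale answer beats an error.
        return "serve-stale-on-error"
    return "fetch-from-origin"
```

Setting both windows to 0 disables the behavior, which matches the guidance for content where no staleness is acceptable.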
Invalidation Strategies
Versioned URLs provide instant invalidation without distributed purges. Assets like main.js become main.abc123.js where the hash changes when content changes. The old version remains harmlessly cached (nobody requests it anymore) while new requests fetch the new version. This is the preferred approach for immutable assets because it requires no global coordination and provides instant updates.
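A build step might derive the versioned name from a hash of the file contents, along these lines. The choice of SHA-256, the 8-character digest length, and the function name are assumptions for this sketch; bundlers each have their own naming scheme.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content, digest_len=8):
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    # main.js -> main.<digest>.js; the directory part is preserved.
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the name depends only on the content, an unchanged file keeps its URL (and its cache entries) across deployments, while any change produces a URL that no cache has seen before.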
When versioning is not possible, explicit purges propagate through distributed message systems. Purge requests flow to all PoPs, which remove the specified content from their caches. This uses eventual consistency: purges typically complete globally within seconds to tens of seconds, but during propagation users may see mixed content versions across different PoPs. Production systems monitor purge lag per region and expose SLAs (Service Level Agreements) around purge latency.
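The mixed-version window during a purge can be illustrated with a small simulation. Everything here is hypothetical: the PoP class, the fixed per-PoP purge_delay, and the time values stand in for a real message bus and real propagation lag.

```python
from dataclasses import dataclass, field


@dataclass
class PoP:
    name: str
    purge_delay: float  # assumed seconds until a purge message reaches this PoP
    cache: dict = field(default_factory=dict)


def apply_purge(pops, key, issued_at, now):
    """Apply a purge to every PoP it has reached by `now`.

    Returns the names of PoPs that still hold the old content, which is the
    set of locations where users may see a stale version.
    """
    still_stale = []
    for pop in pops:
        if now - issued_at >= pop.purge_delay:
            pop.cache.pop(key, None)  # purge has arrived: drop the entry
        elif key in pop.cache:
            still_stale.append(pop.name)
    return still_stale
```

Tracking the size of the returned list over time, per region, is one simple way to measure the purge lag that the text says production systems monitor.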