
How Do CDN Cache Keys, Time to Live (TTL), and Invalidation Strategies Work at Scale?

Cache keys determine which requests can share cached responses; they are derived from the URL path, selected query parameters, and specific headers. Proper cache key design is critical for both performance and security. Overly granular keys cause cache fragmentation, where similar requests miss the cache unnecessarily. Overly broad keys risk cache poisoning, where an attacker manipulates an untrusted header that influences the response but is not part of the key, polluting entries served to other users. Production systems strictly define which request components contribute to cache keys and normalize or strip untrusted headers. For example, User-Agent might be excluded from the cache key so that different browsers share cached assets, while Accept-Encoding is included to distinguish compressed from uncompressed responses.

Content freshness management relies on time-to-live (TTL) values, conditional revalidation, and explicit purges. Common production TTL patterns include 24 hours for static assets like JavaScript bundles and Cascading Style Sheets (CSS), approximately one minute for semi-dynamic content like product listings, and zero or very short TTLs for user-specific data. When a TTL expires, the CDN can revalidate with the origin using conditional requests (If-Modified-Since or If-None-Match headers). The stale-while-revalidate pattern allows serving slightly stale content while fetching a fresh copy in the background, which keeps tail latencies low during origin slowness. Similarly, stale-if-error permits serving cached content when the origin returns errors, improving availability during incidents.

For content that must update immediately, versioned URLs provide instant invalidation without distributed purges. An asset like main.js becomes main.abc123.js, where the hash changes whenever the content changes. The old version remains cached harmlessly while new requests fetch the new version. This approach is preferred for immutable assets.
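The cache key discipline described above can be sketched as a small normalization function. This is a minimal illustration, not any particular CDN's API; the allow-listed parameter and header names are hypothetical:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical allow-list: only these query parameters affect the response.
KEYED_QUERY_PARAMS = {"page", "sort"}
# Headers that legitimately vary the response body.
KEYED_HEADERS = ["accept-encoding"]

def cache_key(url: str, headers: dict) -> str:
    """Derive a normalized cache key from an allow-list of request parts.

    Irrelevant inputs (User-Agent, tracking params) are deliberately
    excluded so they cannot fragment the cache, and only allow-listed
    components are keyed, so unkeyed untrusted headers must be stripped
    before they can influence the origin response.
    """
    parts = urlsplit(url)
    # Keep only allow-listed query params, sorted into a canonical order.
    kept = sorted((k, v) for k, v in parse_qsl(parts.query)
                  if k in KEYED_QUERY_PARAMS)
    query = urlencode(kept)
    header_part = "|".join(
        f"{h}={headers.get(h, '').strip().lower()}" for h in KEYED_HEADERS)
    return f"{parts.path}?{query}#{header_part}"

# Different User-Agents and tracking params share one cache entry:
a = cache_key("/products?utm_source=ads&page=2", {"accept-encoding": "gzip"})
b = cache_key("/products?page=2", {"accept-encoding": "gzip",
                                   "user-agent": "Mozilla/5.0"})
assert a == b
```

Note that compressed and uncompressed variants still get distinct keys, because Accept-Encoding is in the allow-list.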
When versioning is not possible, explicit purges propagate through distributed message buses (often Kafka-like systems). These systems accept eventual consistency, typically propagating purges to all global points of presence (PoPs) within seconds. During the propagation window, users may see mixed content versions across different PoPs. Production systems monitor purge lag per region and expose Service Level Agreements (SLAs) around purge latency.
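A toy model of that purge fan-out, assuming a hypothetical event shape and PoP-local cache, shows where per-region purge lag comes from: each PoP applies the event at its own pace and reports how far behind the publisher it was:

```python
import time

# Hypothetical purge event as it might appear on a Kafka-like bus.
def make_purge_event(url: str) -> dict:
    return {"url": url, "published_at": time.time()}

class PopCache:
    """Minimal PoP-local cache that applies purge events and records lag."""

    def __init__(self, region: str):
        self.region = region
        self.store = {}
        self.last_purge_lag = None  # seconds behind the publisher

    def apply_purge(self, event: dict) -> None:
        self.store.pop(event["url"], None)
        # Lag = local apply time minus publish time; this is the metric
        # a production system would export per region for its purge SLA.
        self.last_purge_lag = time.time() - event["published_at"]

# Fan-out is eventually consistent: until a given PoP has applied the
# event, that PoP may still serve the old object (the "mixed content"
# window described above).
pops = [PopCache("us-east"), PopCache("eu-west")]
event = make_purge_event("/products/42")
for pop in pops:
    pop.store["/products/42"] = "stale body"
    pop.apply_purge(event)
assert all("/products/42" not in p.store for p in pops)
```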
💡 Key Takeaways
Cache keys derived from URL path, selected query parameters, and specific headers must be carefully designed to avoid fragmentation (keys too granular, so similar requests miss) and poisoning (untrusted inputs that influence the response but are excluded from the key)
Production TTL patterns: 24 hours for static assets, approximately one minute for semi-dynamic content, with mission-critical updates using versioned URLs rather than relying on TTL expiry
Stale-while-revalidate serves cached content while fetching a fresh copy in the background, reducing tail latency during origin slowness; stale-if-error improves availability during origin failures
Versioned asset URLs (main.abc123.js) provide instant cache busting without global purge overhead, making them preferred for immutable content
Purge propagation uses distributed message buses with eventual consistency, typically completing globally within seconds but creating temporary mixed content windows across PoPs
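The freshness patterns in the takeaways above reduce to a small decision function. This is a conceptual sketch of the TTL, stale-while-revalidate, and stale-if-error windows, not a specific CDN's implementation; the parameter names mirror the Cache-Control directive semantics:

```python
from enum import Enum

class Action(Enum):
    SERVE_FRESH = "serve cached copy"
    SERVE_STALE_REVALIDATE = "serve stale, refresh in background"
    SERVE_STALE_ON_ERROR = "serve stale because origin is failing"
    FETCH_FROM_ORIGIN = "miss or too stale: block on origin fetch"

def decide(age: float, ttl: float, swr: float, sie: float,
           origin_healthy: bool) -> Action:
    """Pick a serving strategy for a cached object of a given age.

    ttl: freshness lifetime; swr: stale-while-revalidate window;
    sie: stale-if-error window (both measured past the TTL).
    """
    if age <= ttl:
        return Action.SERVE_FRESH
    if age <= ttl + swr:
        # Slightly stale: respond immediately, revalidate asynchronously.
        return Action.SERVE_STALE_REVALIDATE
    if not origin_healthy and age <= ttl + sie:
        # Origin is erroring: prefer stale content over a failure.
        return Action.SERVE_STALE_ON_ERROR
    return Action.FETCH_FROM_ORIGIN
```

For example, with a 60-second TTL, a 30-second stale-while-revalidate window, and a 300-second stale-if-error window, a 70-second-old object is served immediately while a background refresh runs; a 120-second-old object is served only if the origin is unhealthy.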
📌 Examples
Amazon CloudFront implements request collapsing where multiple simultaneous requests for an uncached object trigger only one origin fetch per PoP, with subsequent requests waiting for and sharing that response. This protects origins during cache miss storms.
Microsoft Azure CDN allows cache key customization through rule engines. An e-commerce site might exclude the utm_source query parameter from cache keys so that traffic from different marketing campaigns shares cached product pages, dramatically improving hit ratios.
A streaming service uses versioned URLs for player JavaScript (player.v2.47.js) to push updates instantly without purges. Meanwhile, thumbnail images use 24 hour TTLs with purges only for content policy violations, accepting brief staleness for 99.9% of cases.
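The versioned-URL scheme in the examples above is typically produced at build time by embedding a content hash in the filename. A minimal sketch (the digest length is an arbitrary choice for illustration):

```python
import hashlib
from pathlib import Path

def versioned_name(path: str, content: bytes, digest_len: int = 6) -> str:
    """Embed a content hash in a filename, e.g. main.js -> main.abc123.js.

    A build with different bytes yields a different URL, so stale cached
    copies are simply never requested again; no purge is needed, and the
    versioned URL can be cached with a long, immutable TTL.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = Path(path)
    return f"{p.stem}.{digest}{p.suffix}"

old = versioned_name("main.js", b"console.log('v1');")
new = versioned_name("main.js", b"console.log('v2');")
assert old != new  # changed content yields a new, cache-busting URL
```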