
CDN Failure Modes: Fragmentation, Poisoning, and Negative Caching

Cache Key Fragmentation

Putting too many dimensions in the cache key causes an explosion of variants. Keying on the full User-Agent header generates thousands of variants per URL (one per browser version, OS, and device combination). Varying on Accept-Language without normalization creates dozens more (en-US, en-GB, and en-AU are all treated as separate keys). Result: the hit ratio can drop from 90% to under 40%, with high memory churn and frequent evictions that reduce the hit ratio further. Fix: whitelist only the headers that actually change the response, normalize their values to coarse categories (map all English locales to "en"), and limit query parameter variations to the essential ones.
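The normalization described above can be sketched as a small key-building function. This is a minimal illustration, not any CDN's actual API: the names `ALLOWED_PARAMS`, `device_class`, `language_class`, and `cache_key` are hypothetical, and the User-Agent substring checks are a deliberately coarse heuristic.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical whitelist: only these query parameters change the response.
ALLOWED_PARAMS = {"page", "size"}


def device_class(user_agent: str) -> str:
    """Collapse a full User-Agent string into one of three coarse buckets."""
    ua = user_agent.lower()
    if "ipad" in ua or "tablet" in ua:
        return "tablet"
    if "mobi" in ua or "iphone" in ua or "android" in ua:
        return "mobile"
    return "desktop"


def language_class(accept_language: str) -> str:
    """Keep only the primary subtag of the first listed language (en-GB -> en)."""
    first = accept_language.split(",")[0].split(";")[0].strip().lower()
    return first.split("-")[0] or "en"  # assumed default when the header is empty


def cache_key(url: str, headers: dict[str, str]) -> str:
    """Build a key from the path, whitelisted params, and normalized headers."""
    parts = urlsplit(url)
    # Drop non-whitelisted params and sort so parameter order does not matter.
    params = sorted((k, v) for k, v in parse_qsl(parts.query) if k in ALLOWED_PARAMS)
    return "|".join([
        parts.path,
        urlencode(params),
        device_class(headers.get("User-Agent", "")),
        language_class(headers.get("Accept-Language", "")),
    ])
```

With this sketch, a request for `/home?utm_source=x&page=2` from an iPhone sending `en-GB` and a request for `/home?page=2&utm_source=y` from an Android phone sending `en-US` map to the same key, so both are served from one cached variant.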

Cache Poisoning

When user-controlled query parameters or headers influence a response that is cached without validation, attackers can inject content that is then served to other users. A response personalized from one user's cookie and cached for everyone leaks that user's data, which is a privacy violation. XSS (Cross-Site Scripting) payloads reflected from query parameters get cached and served to victims who visit the same URL. Prevention: never cache responses that carry a Set-Cookie header, validate and sanitize any input that feeds the cache key or the response, use signed URLs for sensitive content, and send Cache-Control: private on personalized responses so the CDN does not store them.
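The Set-Cookie and Cache-Control: private rules above amount to a storage check at the shared cache. A minimal sketch follows; the function name `is_cacheable_by_cdn` is hypothetical, and treating `no-store` as non-cacheable is an addition beyond the rules listed above (it is standard HTTP caching behavior).

```python
def is_cacheable_by_cdn(response_headers: dict[str, str]) -> bool:
    """Return False for responses a shared cache should not store."""
    # Header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in response_headers.items()}

    # A Set-Cookie response is tied to one user; never share it.
    if "set-cookie" in headers:
        return False

    # Split Cache-Control into directive names, ignoring any "=value" part.
    directives = {
        part.strip().split("=")[0].lower()
        for part in headers.get("cache-control", "").split(",")
    }
    return not directives & {"private", "no-store"}
```

In this sketch a response with no Cache-Control header is still treated as cacheable; a stricter policy could require an explicit `public` or `s-maxage` directive before storing anything.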

Negative Cache Poisoning

Caching 404 Not Found or server error (5xx) responses masks newly created content and prolongs temporary failures. Example: a user creates a resource, but the CDN cached a 404 from an earlier request, so the user sees "not found" for the entire TTL even though the resource now exists. Fix: use a very short TTL for error responses (1-10 seconds), or do not cache 4xx/5xx responses at all. Monitor negative cache hit rates; sudden spikes indicate application bugs or attack patterns.
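The TTL policy above can be expressed as a function of the status code. This is a sketch under assumed defaults: `ttl_for_status` is a hypothetical name, the 3600-second success TTL is illustrative, and the 5-second error TTL is simply one value inside the 1-10 second range suggested above.

```python
def ttl_for_status(status: int, success_ttl: int = 3600, error_ttl: int = 5) -> int:
    """Return the cache TTL in seconds for a response; 0 means do not cache."""
    if 200 <= status < 300:
        return success_ttl
    if status >= 400:
        # Short-lived negative caching; pass error_ttl=0 to disable it entirely.
        return error_ttl
    # Redirects and other statuses are left uncached in this sketch.
    return 0
```

Passing `error_ttl=0` gives the "do not cache 4xx/5xx at all" variant, at the cost of sending every failed request back to the origin.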

Purge Propagation Failures

Invalidation requests can fail to reach all PoPs (points of presence) because of network issues, queue saturation, or PoP unavailability. Some users then see stale content while others see fresh content, creating an inconsistent experience. Monitor purge completion rates and latency across all PoPs. Backup strategy: use version-based URLs (asset-v2.js), where changing the filename creates a new cache key and sidesteps purge propagation entirely.
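One common way to produce version-based names is to derive the version from the file's content rather than a hand-maintained counter like `v2`. The sketch below uses that content-hash variant as an assumption; the function name `versioned_name` and the 8-character digest length are illustrative choices.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension: style.css -> style-<hash>.css."""
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(path.with_name(f"{path.stem}-{digest}{path.suffix}"))
```

Any change to the file's bytes yields a different name and therefore a new cache key, while an unchanged file keeps its name and stays cached. The pages that reference the asset must be updated to the new name for the change to take effect.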

💡 Key Takeaways
Cache key fragmentation: unbounded Vary headers can drop the hit ratio from 90% to under 40%. Whitelist necessary headers and normalize their values.
Cache poisoning: unvalidated user input reflected in a cached response serves attacker content to victims. Never cache responses with Set-Cookie; use Cache-Control: private for personalized responses.
Negative cache poisoning: cached 404s mask newly created content. Use a 1-10s TTL for errors, or do not cache 4xx/5xx at all.
Purge failures: invalidation may not reach all PoPs. Version-based URLs (asset-v2.js) bypass purging by creating a new cache key.
📌 Interview Tips
1. Fragmentation: Vary: User-Agent can create 1000+ variants per URL. Normalize to mobile/desktop/tablet (3 variants) instead.
2. Poisoning: an attacker requests a URL with ?evil=<script>alert(1)</script>, the response reflecting it is cached, and every user served that cached entry receives the XSS payload.
3. Version URLs: instead of purging style.css, rename it to style-v2.css. Once pages reference the new name, every cache fetches the new file, so no purge is needed.