CDN Failure Modes: Fragmentation, Poisoning, and Negative Caching
Cache Key Fragmentation
Too many dimensions in the cache key cause an explosion of variants. Keying on the full User-Agent header generates thousands of variants per URL (one for every browser version, OS, and device combination). Varying on Accept-Language without normalization creates dozens of variants (en-US, en-GB, and en-AU are all treated as separate keys). Result: the hit ratio can drop from around 90% to under 40%, with high memory churn and frequent evictions that reduce the hit ratio further. Fix: whitelist only the necessary headers, normalize them to coarse categories (for example, map all English locales to "en"), and limit query parameter variations to the essential ones.
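The normalization described above can be sketched as a small key-building function. This is an illustration only: the parameter whitelist, the two device classes, the supported languages, and the function names are assumptions made for the example, not the behavior of any particular CDN.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical whitelist: only these query parameters may vary the cache key.
ALLOWED_PARAMS = {"page", "size"}


def device_class(user_agent: str) -> str:
    """Collapse the full User-Agent into a coarse bucket (assumed: two classes)."""
    return "mobile" if "mobi" in user_agent.lower() else "desktop"


def language_bucket(accept_language: str,
                    supported=("en", "de", "fr"),
                    default="en") -> str:
    """Map e.g. en-US, en-GB, en-AU to "en". Ignores q-values for brevity."""
    first_tag = accept_language.split(",")[0].split(";")[0].strip().lower()
    primary = first_tag.split("-")[0]
    return primary if primary in supported else default


def cache_key(url: str, headers: dict) -> str:
    """Build a key from the path, whitelisted params, and coarse header buckets."""
    parts = urlsplit(url)
    # Drop non-whitelisted params and sort so parameter order does not matter.
    params = sorted((k, v) for k, v in parse_qsl(parts.query) if k in ALLOWED_PARAMS)
    return "|".join([
        parts.path,
        urlencode(params),
        device_class(headers.get("User-Agent", "")),
        language_bucket(headers.get("Accept-Language", "")),
    ])
```

With this sketch, requests that differ only in tracking parameters, browser version, or English locale collapse onto a single key, which is the point of the fix: fewer variants per URL means more hits per variant.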
Cache Poisoning
User-controlled query parameters or headers that shape a cached response without validation allow attackers to inject content that is then served to other users. Cookie-based personalization that leaks into a shared cache and is served to all users is a privacy violation. XSS (Cross-Site Scripting) payloads in query parameters can get cached and served to victims visiting the same URL. Prevention: never cache responses that carry a Set-Cookie header, validate and sanitize cache key inputs, use signed URLs for sensitive content, and set Cache-Control: private on personalized responses to prevent CDN caching.
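Two of the prevention rules above (no Set-Cookie, respect Cache-Control: private) can be expressed as a guard that runs before a response is stored in a shared cache. This is a minimal sketch; the function name is hypothetical, and a real CDN applies many more rules than these two checks.

```python
from typing import Mapping


def is_shared_cacheable(headers: Mapping[str, str]) -> bool:
    """Return False for responses a shared (CDN) cache should not store."""
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}

    # A Set-Cookie header usually means the response is tied to one user.
    if "set-cookie" in h:
        return False

    # Collect directive names, dropping arguments such as max-age=60.
    directives = {
        d.strip().split("=")[0].lower()
        for d in h.get("cache-control", "").split(",")
    }
    # "private" and "no-store" both forbid storage in a shared cache.
    return not ({"private", "no-store"} & directives)
```

Treating this as a deny-by-default filter at the edge is a defensive choice: a misconfigured origin that forgets Cache-Control: private on a personalized page is still caught by the Set-Cookie check.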
Negative Cache Poisoning
Caching 404 Not Found or error (5xx) responses masks newly created content and prolongs temporary failures. Example: a user creates a resource, but the CDN has cached a 404 from an earlier request, so the user sees Not Found for the entire TTL even though the resource now exists. Fix: use a very short TTL for error responses (1-10 seconds), or do not cache 4xx/5xx responses at all. Monitor negative cache hit rates; sudden spikes indicate application bugs or attack patterns.
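The fix above amounts to choosing the TTL from the status code. A minimal sketch follows, assuming a hypothetical one-hour TTL for successful responses and a 5-second TTL for errors (inside the 1-10 second range suggested above); the function name and defaults are illustrative only.

```python
def ttl_for_status(status: int, success_ttl: int = 3600, error_ttl: int = 5) -> int:
    """Return the cache TTL in seconds for a response status code.

    Pass error_ttl=0 to avoid caching 4xx/5xx responses at all.
    """
    if 200 <= status < 300:
        return success_ttl
    if status >= 400:
        # Negative caching: keep errors only briefly so new content or a
        # recovered origin becomes visible within a few seconds.
        return error_ttl
    # Assumption for this sketch: other statuses (1xx, 3xx) are not cached.
    return 0
```

A short non-zero error TTL still shields the origin from a burst of identical failing requests, while bounding how long a stale 404 can hide a newly created resource.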
Purge Propagation Failures
Invalidation requests can fail to reach all PoPs (points of presence) because of network issues, queue saturation, or PoP unavailability. Some users then see stale content while others see fresh content, creating an inconsistent experience. Monitor purge completion rates and latency across all PoPs. Backup strategy: version-based URLs (asset-v2.js), where changing the filename creates a new cache key and bypasses purge propagation entirely.
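One common way to produce version-based URLs is to derive the version from the file contents rather than bumping a number by hand. The sketch below uses a truncated SHA-256 digest in place of the manual "v2" suffix mentioned above; the function name and the 8-character digest length are assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension.

    Any change to the content yields a new filename, hence a new cache key.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the name changes only when the content does, unchanged assets keep their cached copies across deploys, and changed assets are fetched under a key no PoP has seen before, so no purge is needed for them.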