
CDN Cache Stampede and Thundering Herd Mitigation

Cache Stampede in CDN Context
A cache stampede occurs when a popular cached object expires or is purged and thousands of concurrent requests discover it missing at the same moment. Without coordination, each request generates its own origin fetch. Even if every edge collapses its own misses into one fetch, a globally distributed CDN with 200 edge PoPs can still open 200 simultaneous origin connections within milliseconds of a single hot object expiring. If the object is 5 MB and each fetch takes about 200 ms round trip, the origin suddenly faces 200 × 5 MB = 1 GB of egress and 200 requests in a 200 ms window.
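The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: `stampede_load` is a hypothetical helper, decimal units (1 GB = 1000 MB) are assumed, and the throughput figure assumes every transfer has to finish inside the window.

```python
from typing import Tuple


def stampede_load(pops: int, object_mb: float, window_ms: float) -> Tuple[float, float]:
    """Return (total egress in MB, origin throughput in Gbit/s needed to
    deliver that egress within the window). Hypothetical helper."""
    total_mb = pops * object_mb
    # MB -> Mbit (x8) -> Gbit (/1000), then divide by the window in seconds.
    gbit_per_s = (total_mb * 8 / 1000) / (window_ms / 1000)
    return total_mb, gbit_per_s


total_mb, gbit_per_s = stampede_load(pops=200, object_mb=5, window_ms=200)
print(f"{total_mb:.0f} MB of egress, about {gbit_per_s:.0f} Gbit/s if served within the window")
```

One fetch per PoP is already the best case without a parent tier; if individual edges do not collapse their own misses, `pops` becomes the number of concurrent requests instead.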
Request Coalescing
Request coalescing collapses concurrent misses for the same cache key into a single origin fetch. When multiple edges miss simultaneously, the parent tier (Origin Shield) receives all of their requests but issues only one to the origin. This can reduce origin requests by 90% or more during a stampede. Implementation: track in-flight requests per key, hold arriving requests for that key until the first fetch completes, then serve all of them from that response.
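The implementation steps above can be sketched as a small thread-based "single flight" helper. This is a minimal illustration, not how any particular CDN implements its shield tier: `Coalescer`, `slow_origin`, and the `/video.mp4` key are hypothetical names, and a real parent tier would also store the response in its cache.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List


class Coalescer:
    """Collapse concurrent fetches for the same key into one origin call."""

    def __init__(self, fetch_origin: Callable[[str], bytes]) -> None:
        self._fetch_origin = fetch_origin
        self._lock = threading.Lock()
        self._in_flight: Dict[str, Future] = {}  # key -> pending fetch

    def get(self, key: str) -> bytes:
        with self._lock:
            fut = self._in_flight.get(key)
            leader = fut is None
            if leader:
                # First request for this key: register an in-flight marker.
                fut = Future()
                self._in_flight[key] = fut
        if leader:
            try:
                fut.set_result(self._fetch_origin(key))
            except Exception as exc:
                fut.set_exception(exc)  # followers see the same failure
            finally:
                with self._lock:
                    del self._in_flight[key]
        # Followers block here until the leader's fetch completes.
        return fut.result()


# Demo: 50 simulated edges miss the same object at the same time.
origin_calls: List[str] = []


def slow_origin(key: str) -> bytes:
    origin_calls.append(key)
    time.sleep(0.2)  # simulated origin latency
    return f"payload for {key}".encode()


shield = Coalescer(slow_origin)
start = threading.Barrier(50)  # release all simulated edges together


def edge_request(_: int) -> bytes:
    start.wait()
    return shield.get("/video.mp4")


with ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(edge_request, range(50)))

print(len(results), "responses served from", len(origin_calls), "origin fetch(es)")
```

The in-flight entry is removed once the fetch finishes, so a request arriving later starts a new fetch unless the response has been cached in the meantime. Errors are propagated to every waiting request rather than retried, which is one possible design choice among several.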
Stale While Revalidate
Serve slightly stale content (within a soft TTL) while a background worker refreshes it from the origin. Users get an immediate 20-50 ms response from the stale cache while the refresh happens asynchronously, which removes the user-visible latency of a cache refresh. Configure two TTLs: a fresh period (serve directly) and a stale grace period (serve stale and trigger a refresh).
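The two-TTL behaviour can be sketched with an in-memory cache. This is a simplified illustration under stated assumptions: `SwrCache`, `fake_origin`, and the injectable `clock` are hypothetical names introduced for the example, and the fake clock exists only to make the demo deterministic.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Set, Tuple


class SwrCache:
    """Serve fresh entries directly, stale entries while refreshing."""

    def __init__(self, fetch_origin: Callable[[str], bytes], fresh_ttl: float,
                 stale_grace: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_origin = fetch_origin
        self._fresh_ttl = fresh_ttl
        self._stale_grace = stale_grace
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[bytes, float]] = {}  # key -> (value, stored_at)
        self._refreshing: Set[str] = set()
        self.executor = ThreadPoolExecutor(max_workers=4)  # background refreshers

    def get(self, key: str) -> bytes:
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, stored_at = entry
                age = now - stored_at
                if age < self._fresh_ttl:
                    return value  # fresh: serve directly
                if age < self._fresh_ttl + self._stale_grace:
                    # Stale but within grace: serve now, refresh once in background.
                    if key not in self._refreshing:
                        self._refreshing.add(key)
                        self.executor.submit(self._refresh, key)
                    return value
        # Missing or past the grace period: the caller waits for the origin.
        return self._fetch_and_store(key)

    def _fetch_and_store(self, key: str) -> bytes:
        value = self._fetch_origin(key)
        with self._lock:
            self._entries[key] = (value, self._clock())
        return value

    def _refresh(self, key: str) -> None:
        try:
            self._fetch_and_store(key)
        finally:
            with self._lock:
                self._refreshing.discard(key)


# Demo with a fake clock: 60 s fresh period, 30 s stale grace period.
now = [0.0]
fetches: List[str] = []


def fake_origin(key: str) -> bytes:
    fetches.append(key)
    return f"v{len(fetches)}".encode()  # v1, v2, ... per origin fetch


cache = SwrCache(fake_origin, fresh_ttl=60, stale_grace=30, clock=lambda: now[0])
first = cache.get("/index.html")      # cold miss: synchronous origin fetch
now[0] = 30
fresh = cache.get("/index.html")      # age 30 s: fresh, no origin fetch
now[0] = 70
stale = cache.get("/index.html")      # age 70 s: stale copy served, refresh queued
cache.executor.shutdown(wait=True)    # demo only: wait for the background refresh
now[0] = 75
refreshed = cache.get("/index.html")  # refreshed copy is now fresh
print(first, fresh, stale, refreshed, len(fetches))
```

The `_refreshing` set ensures that only one background refresh runs per key, so a burst of requests during the grace period does not itself become a stampede.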
TTL Jitter
Randomize each object's expiration by ±10-20% to prevent synchronized expiration storms. With a 1-hour base TTL and ±20% jitter, 1000 objects that would all expire at exactly 12:00:00 instead expire between 11:48:00 and 12:12:00. Formula for ±10% jitter: TTL = base + random(-0.1*base, 0.1*base). This spreads refresh load over a time window rather than creating a traffic spike.
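The jitter formula translates directly into code. The sketch below assumes uniform jitter; `jittered_ttl` and its `jitter` parameter are hypothetical names, and the seeded generator is used only so the demo is repeatable.

```python
import random
from typing import Optional


def jittered_ttl(base_seconds: float, jitter: float = 0.10,
                 rng: Optional[random.Random] = None) -> float:
    """Return base_seconds scaled by a uniform random factor in [1 - jitter, 1 + jitter]."""
    if not 0 <= jitter < 1:
        raise ValueError("jitter must be in [0, 1)")
    rng = rng if rng is not None else random.Random()
    # Equivalent to base + random(-jitter * base, +jitter * base).
    return base_seconds * (1 + rng.uniform(-jitter, jitter))


# Demo: 1000 objects with a 1-hour base TTL and ±10% jitter.
rng = random.Random(42)
ttls = [jittered_ttl(3600, 0.10, rng) for _ in range(1000)]
print(f"shortest {min(ttls) / 60:.1f} min, longest {max(ttls) / 60:.1f} min")
```

Every sampled TTL falls between 54 and 66 minutes, so expirations that would have coincided are spread across a window of up to 12 minutes.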

💡 Key Takeaways

✓ Stampede: 200 PoPs miss the same object simultaneously = 200 origin requests in milliseconds. 5 MB object = 1 GB egress spike in a 200 ms window.

✓ Request coalescing: multiple misses for the same key collapse into a single origin fetch. Reduces origin load by 90%+ during stampedes.

✓ Stale while revalidate: serve stale content while refreshing in the background. User gets a 20-50 ms response, refresh happens async.

✓ TTL jitter: ±10-20% randomization prevents synchronized expiration. With a 1-hour TTL and ±20% jitter, 1000 objects spread over a 24-minute window vs an instant spike.

📌 Interview Tips

1. Coalescing: 50 edges miss a popular video, the parent receives 50 requests, issues 1 to the origin, and serves all 50 from that response.

2. Stale while revalidate: s-maxage=60, stale-while-revalidate=30. The object is fresh for 60 s; from 60 s to 90 s of age, serve it stale while refreshing in the background.

3. Jitter formula: 1-hour TTL with ±10% jitter = 54-66 minute actual TTL, spreading 1000 expirations over 12 minutes.
