
Failure Modes: Cache Poisoning, Thundering Herds, and Unbounded Transforms

Cache Poisoning

Cache poisoning occurs when a CDN caches the wrong content for a URL. In image optimization the typical sequence is: a client requests /image.jpg, the transformation service returns WebP, and the CDN caches that WebP under /image.jpg. The next client does not support WebP but receives the cached WebP anyway. Root cause: a missing or incorrect Vary header. Fix: always set Vary: Accept on format-negotiated responses, so that the CDN keeps separate cache entries per Accept value. Verify that your CDN honors Vary when caching (some do not). Alternative: encode the format in the URL path (/image.webp) instead of relying on content negotiation.
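As a rough sketch of the fix, the origin can pick the format from the Accept header and always attach Vary: Accept to the response. The function names, the Cache-Control value, and the normalized cache key below are assumptions for illustration, not any particular CDN's API. The sketch also ignores q-values in Accept for brevity.

```python
def negotiate_format(accept_header: str) -> str:
    """Return WebP only when the client explicitly lists it, else JPEG."""
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    return "image/webp" if "image/webp" in accepted else "image/jpeg"


def build_response_headers(accept_header: str) -> dict:
    """Headers for a format-negotiated image response."""
    return {
        "Content-Type": negotiate_format(accept_header),
        # Tells caches that the body depends on the request's Accept header.
        "Vary": "Accept",
        # Hypothetical lifetime; tune to your own invalidation strategy.
        "Cache-Control": "public, max-age=86400",
    }


def cache_key(path: str, accept_header: str) -> str:
    """Hypothetical normalized key: one entry per negotiated format,
    rather than one entry per distinct raw Accept string."""
    return f"{path}|{negotiate_format(accept_header)}"
```

Keying on the negotiated format rather than the raw header is one way to limit cache fragmentation, since browsers send many different Accept strings that resolve to the same output format.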

Thundering Herd on New Images

A popular new image is posted. Thousands of users request it simultaneously, before any variant has been generated or cached. Every request hits the origin transformation service, and each one triggers a compute-intensive transformation. Origin CPU saturates, latency spikes, and requests time out. Mitigation strategies: request coalescing (hold duplicate requests while the first one transforms, then serve all of them from that result), async generation (return a placeholder, generate in the background, and let the client retry), and pre-warming (generate common variants immediately on upload, before any request arrives). Request coalescing is the most effective of the three, but it requires a coordination layer that tracks in-flight transformations.

⚠️ Key Trade-off: Request coalescing adds complexity and a potential single point of failure. If the first request fails, every coalesced request fails with it. Implement coalescing with a timeout and a fallback to direct processing.
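A minimal in-process sketch of coalescing with a timeout and fallback might look like the following. The Coalescer class and its parameters are hypothetical, and a real deployment with several origin servers would need a shared coordination layer rather than a per-process dictionary.

```python
import threading
from concurrent.futures import Future


class Coalescer:
    """Collapse concurrent requests for the same key into one transform."""

    def __init__(self, transform, wait_timeout=5.0):
        self._transform = transform        # callable: key -> result
        self._wait_timeout = wait_timeout  # seconds a follower will wait
        self._lock = threading.Lock()
        self._in_flight = {}               # key -> Future for the leader's work

    def get(self, key):
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                future = Future()
                self._in_flight[key] = future

        if leader:
            try:
                result = self._transform(key)
            except Exception as exc:
                future.set_exception(exc)  # wake followers with the failure
                raise
            else:
                future.set_result(result)  # wake followers with the result
                return result
            finally:
                with self._lock:
                    self._in_flight.pop(key, None)

        try:
            # Follower: wait for the leader's result instead of transforming.
            return future.result(timeout=self._wait_timeout)
        except Exception:
            # Leader failed or took too long: fall back to direct processing.
            return self._transform(key)
```

With this shape, a leader failure no longer fails every waiting request outright, at the cost of a possible second burst of transforms when the fallback path is taken.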

Unbounded Transformations

An attacker requests /image.jpg?w=10000&h=10000. The transformation service attempts to create a massive image, exhausts memory, and crashes. Alternatively, the attacker requests /image.jpg?w=1, then ?w=2, then ?w=3, and so on, filling the cache with millions of variants. Mitigations: allowlist dimensions (accept only predefined sizes: 100, 200, 400, 800, 1600), max dimension limits (reject any dimension above 4000px), signed URLs (transformation parameters must be cryptographically signed), and rate limiting per source image. Production systems typically combine allowlisted dimensions with signed URLs for custom transformations.

SSRF via Image URL

Some services fetch images from user-provided URLs for processing. An attacker supplies an internal URL such as http://169.254.169.254/metadata (the cloud metadata service). The service fetches the URL and returns internal data. This is Server-Side Request Forgery (SSRF). Mitigations: allowlist domains (fetch only from known sources), blocklist internal IP ranges, run the fetcher in a dedicated network segment with no internal access, and validate that the response is actually an image before processing it. SSRF via image processing is a common vulnerability in media pipelines.
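One way to sketch the domain allowlist and internal-IP check uses the standard ipaddress module. The allowlisted host and function names are hypothetical. A production fetcher would also need to connect to the exact address it validated, since a hostname can resolve differently between the check and the fetch.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"images.example.com"}  # hypothetical allowlist


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses (not loopback, private,
    or link-local ranges such as 169.254.0.0/16)."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        return False


def is_safe_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only http(s) URLs on allowlisted hosts that resolve to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if host is None or host not in ALLOWED_HOSTS:
        return False
    try:
        # Each getaddrinfo entry ends with a sockaddr whose first item is the IP.
        addresses = [info[4][0] for info in resolve(host, None)]
    except OSError:
        return False
    # Every resolved address must be public, not just the first one.
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

The resolve parameter exists only so the DNS lookup can be replaced in tests.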

Codec Vulnerabilities

Image and video parsers are complex and historically vulnerable. A malformed image file can trigger a buffer overflow in the decoder, leading to code execution. Defense: use hardened, sandboxed decoders. Run transformations in isolated containers. Keep codec libraries updated, because new vulnerabilities are discovered regularly. Consider preprocessing with a simple validator before the full decode. Rate limit transformations per source IP to limit exploitation attempts.

Runaway Encoding Costs

Video encoding is expensive. A bug or an attack can trigger mass re-encoding, and the cloud encoding bill can spike from $1,000/day to $50,000/day. Mitigations: budget alerts at 2x and 5x normal spend, hard caps that stop encoding when exceeded, rate limits on encoding job submissions, and monitoring for unusual encoding patterns (the same video encoded repeatedly, abnormal video lengths). Investigate before lifting caps, because a legitimate viral video calls for a different response than an attack or a bug.
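The alert thresholds and the hard cap can be sketched as a small guard that every encoding job submission passes through. The class, field names, and status strings are hypothetical. Real spend figures would come from the cloud provider's billing data rather than from per-job estimates.

```python
from dataclasses import dataclass


@dataclass
class EncodingBudget:
    normal_daily_usd: float       # typical daily spend, e.g. 1000
    hard_cap_usd: float           # encoding stops once this is reached
    spent_today_usd: float = 0.0

    def status(self) -> str:
        """Map current spend to an alert level (2x and 5x of normal)."""
        if self.spent_today_usd >= self.hard_cap_usd:
            return "blocked"
        if self.spent_today_usd >= 5 * self.normal_daily_usd:
            return "critical"
        if self.spent_today_usd >= 2 * self.normal_daily_usd:
            return "warning"
        return "ok"

    def try_submit(self, estimated_cost_usd: float) -> bool:
        """Admit the job only if it keeps spend within the hard cap."""
        if self.spent_today_usd + estimated_cost_usd > self.hard_cap_usd:
            return False          # rejected: a human should investigate first
        self.spent_today_usd += estimated_cost_usd
        return True
```

For example, with normal_daily_usd=1000 and hard_cap_usd=10000, spend of 2,500 reports "warning", spend of 5,500 reports "critical", and a job that would push the total past 10,000 is rejected.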

💡 Key Takeaways
Cache poisoning from a missing Vary header serves a format the client does not support - use Vary: Accept or encode the format in the URL path
Thundering herd on new images: use request coalescing to hold duplicates and serve all of them from the first result once it completes
Unbounded transformations: allowlist dimensions (100, 200, 400, 800, 1600), enforce max limits, require signed URLs for custom sizes
SSRF via image URL: attacker provides an internal URL such as the metadata service - use a domain allowlist and blocklist internal IPs
📌 Interview Tips
1. Explain cache poisoning and why the Vary: Accept header is critical for format negotiation systems
2. Describe request coalescing for thundering herd: hold duplicate requests, let the first request generate the variant, then serve every waiting request from that result
3. Present dimension security: allowlist common sizes plus signed URLs for custom transformations to prevent cache flooding