On-Demand vs. Precomputed Derivatives: Architecture Trade-offs
The fundamental architectural choice in media optimization is whether to generate derivatives on demand, at request time, or to precompute them ahead of time. This decision cascades through every aspect of the system, including storage costs, latency profiles, compute capacity, and operational complexity.
On-demand transformation means the first request for a specific variant triggers the transformation in real time. The advantages are storage efficiency and flexibility: you only store what is actually requested, avoiding variant explosion, where a single asset balloons into dozens of derivatives (for example, 6 widths × 2 DPR levels × 2 formats × 2 quality levels = 48 variants per asset), and new formats or sizes become available immediately by changing URL parameters. The tradeoff is unpredictable tail latency. Cache misses force expensive CPU or GPU operations into the request path, with p95 latency spiking from 50 milliseconds to 300 to 800 milliseconds. You also face thundering herd risk, where viral content or a cache purge triggers thousands of concurrent transformations of the same derivative.
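The cache-miss path above can be sketched in a few lines. This is a minimal illustration under assumed names (`DerivativeCache`, `transform_image`, `serve_variant` are hypothetical, not a real library API); the point is that the expensive transform sits inside the request path only on a miss.

```python
# Hypothetical sketch of an on-demand transform path: serve from the
# derivative cache when possible, otherwise transform in the request path.
class DerivativeCache:
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def put(self, key, blob):
        self._store[key] = blob

def transform_image(original: bytes, width: int, fmt: str) -> bytes:
    # Stand-in for the expensive CPU/GPU resize + re-encode step.
    return b"%s@%dw.%s" % (original, width, fmt.encode())

def serve_variant(cache, original: bytes, asset_id: str,
                  width: int, fmt: str) -> bytes:
    key = f"{asset_id}:{width}:{fmt}"
    blob = cache.get(key)
    if blob is None:                   # cache miss: the 300-800 ms path
        blob = transform_image(original, width, fmt)
        cache.put(key, blob)           # subsequent requests become hits
    return blob
```

Only the first request for a given (asset, width, format) tuple pays the transform cost; everything after is a cache hit.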
Precomputed derivatives flip these tradeoffs. You generate common variants at upload time or via background jobs and store them alongside the original. This delivers predictably low latency, because every request is a cache hit, and shifts compute cost from request time to asynchronous batch jobs. The downside is that the variant explosion problem becomes real: a platform serving 100 million assets with 48 variants each stores 4.8 billion derivatives, multiplying storage costs and complicating cache invalidation. Adding support for a new format like AVIF requires a backfill campaign to regenerate billions of assets.
Production systems often use a hybrid approach: precompute the top 5 to 10 most common variants (for example, mobile and desktop hero sizes in WebP and JPEG) to cover 80 to 90 percent of traffic, then handle long-tail requests on demand, with request coalescing to prevent duplicate work and admission control to shed load under spikes. imgix processes approximately 1 billion images per day using primarily on-demand transformations with aggressive edge caching to achieve greater-than-95-percent hit rates, demonstrating that with sufficient cache warmth, on demand can match precomputed latency for hot content.
💡 Key Takeaways
• On-demand transformations store fewer variants and support new formats immediately via URL parameters, but cache misses add 100 to 400 milliseconds of transformation latency and require CPU or GPU capacity with admission control to handle request spikes
• Precomputed derivatives deliver predictably low latency because all requests are cache hits, but variant explosion (48 or more derivatives per asset) increases storage costs linearly and complicates invalidation workflows when source assets update
• Thundering herd scenarios occur when viral content or cache purges trigger thousands of concurrent expensive transforms of the same derivative, requiring request coalescing (single flight per unique variant) and circuit breakers on upstream transformation pools
• Hybrid strategies precompute the top 5 to 10 most requested variants, covering 80 to 90 percent of traffic, while handling long-tail requests on demand, balancing storage costs against compute and latency
• Cache key normalization is critical in on-demand systems to prevent duplicate storage of functionally identical derivatives, requiring stable parameter ordering and canonical representation of transformation configs
• Background regeneration campaigns are necessary when adding new formats to precomputed architectures: platforms backfill the top N assets offline and roll out format negotiation gradually to avoid runtime compute spikes
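The cache key normalization point above can be sketched as follows. The parameter names, aliases, and default values here are hypothetical; the technique is what matters: canonicalize aliases, drop explicit defaults, and sort parameters so functionally identical URLs map to one derivative.

```python
# Assumed platform defaults and parameter aliases -- illustrative only.
DEFAULTS = {"q": "75", "dpr": "1"}
ALIASES = {"format": "fmt", "width": "w"}

def canonical_key(asset_id: str, params: dict) -> str:
    norm = {}
    for k, v in params.items():
        k = ALIASES.get(k, k)              # map aliases to one spelling
        if DEFAULTS.get(k) == str(v):      # explicit default == omitted
            continue
        norm[k] = str(v)
    # Stable ordering makes the key independent of URL parameter order.
    qs = "&".join(f"{k}={v}" for k, v in sorted(norm.items()))
    return f"{asset_id}?{qs}"
```

With this normalization, `?w=640&fmt=webp` and `?format=webp&width=640&q=75` produce the same cache key and therefore the same stored derivative.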
📌 Examples
A social media platform with 100 million assets and 48 precomputed variants per asset stores 4.8 billion derivatives, requiring petabytes of storage and complex invalidation logic when users update profile pictures
imgix achieves greater-than-95-percent edge cache hit rates with on-demand transformations while serving approximately 1 billion images per day through aggressive CDN caching, demonstrating that hot content performs equivalently to precomputed content with sufficient cache warmth
A video platform precomputes H.264 and VP9 encodes for the top 10,000 most-watched videos (covering 70 percent of views) while encoding long-tail content on first request, with the 2-to-5-minute transcoding latency absorbed by asynchronous workflows