
What is Image and Video Optimization in System Design?

Image and video optimization is the production pipeline that transforms a single high-quality source asset into many derivatives tailored for different devices, network conditions, and display contexts. The goal is to serve the smallest possible file that maintains acceptable visual quality, minimizing bandwidth costs and improving user experience.

The core optimization levers work at different layers. For images, you resize to match the actual display dimensions (accounting for device pixel ratio, or DPR), compress to the lowest perceptually acceptable quality, and choose the optimal format based on client capabilities. For example, serving WebP instead of JPEG can reduce file sizes by 25 to 34 percent with no visible quality loss, and serving AVIF to browsers that support it can cut a further 10 to 20 percent. For video, the biggest savings come from efficient encoding ladders and adaptive bitrate (ABR) delivery, where players dynamically switch between quality levels based on available bandwidth.

Operationally, the system sits as a proxy between origin storage and Content Delivery Network (CDN) edges. On a cache hit, the CDN serves the derivative in 20 to 50 milliseconds. On a miss, a transformation service fetches the original, applies the requested transformations, returns the asset, and caches it for future requests. The cache key must capture all transformation parameters and client capabilities to avoid serving the wrong variant.

Quality is measured using perceptual metrics rather than file size alone. For images, Structural Similarity Index (SSIM) or Multi-Scale SSIM (MS-SSIM) guide compression decisions. For video, Video Multimethod Assessment Fusion (VMAF) scores determine optimal bitrates. Production systems target just enough quality at each resolution, often achieving 30 to 60 percent size reductions compared to naive quality settings.
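The sizing, format negotiation, and cache key rules described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular CDN implements it: the parameter names `w`, `q`, and `fmt` and all three function names are hypothetical, and the Accept parsing ignores q-value weighting for brevity.

```python
import math
from urllib.parse import urlencode


def target_width(css_width: int, dpr: float) -> int:
    """Physical pixel width needed to fill a CSS-pixel slot at a given DPR."""
    return math.ceil(css_width * dpr)


def pick_format(accept_header: str) -> str:
    """Return the most efficient format the client advertises, else JPEG."""
    # Drop parameters such as ";q=0.8" and compare bare MIME types.
    accepted = {part.split(";")[0].strip().lower() for part in accept_header.split(",")}
    for mime, fmt in (("image/avif", "avif"), ("image/webp", "webp")):
        if mime in accepted:
            return fmt
    return "jpeg"


def cache_key(path: str, params: dict, accept_header: str) -> str:
    """Build a key that is identical for equivalent requests."""
    merged = {name.lower(): str(value) for name, value in params.items()}
    # The negotiated format is part of the key so variants never collide.
    merged["fmt"] = pick_format(accept_header)
    # Sorting makes ?w=640&q=70 and ?q=70&w=640 share one cache entry.
    return f"{path}?{urlencode(sorted(merged.items()))}"


key = cache_key("/img/hero.jpg", {"w": target_width(320, 2), "q": 70},
                "image/avif,image/webp,*/*")
print(key)  # /img/hero.jpg?fmt=avif&q=70&w=640
```

Folding the negotiated format into the key, rather than the raw Accept header, keeps the number of cached variants small: every AVIF-capable client maps to the same entry regardless of how its header is worded.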
💡 Key Takeaways
The transformation pipeline creates device-specific derivatives from a single source, optimizing three dimensions: sizing (matching display resolution and DPR), compression (using perceptual quality metrics like SSIM or VMAF), and format selection (AVIF, WebP, VP9, AV1 for modern clients)
CDN edge caching is critical for performance: well-tuned systems achieve cache hit rates above 95 percent and a Time To First Byte (TTFB) of 20 to 50 milliseconds at metropolitan points of presence, compared to 100 to 400 milliseconds on cold misses
Format negotiation yields substantial savings, with real production numbers showing WebP 25 to 34 percent smaller than JPEG, AVIF adding another 10 to 20 percent reduction, and animated GIF to MP4 conversion saving 50 to 80 percent
Video optimization relies on adaptive bitrate streaming with 2 to 4 second segments across multiple resolutions, targeting startup latency under 2 seconds and rebuffer rates under 1 to 2 percent while delivering the lowest bitrate that maintains target VMAF scores
Cache keys must encode all transformation parameters and client capabilities to prevent serving wrong variants, requiring normalized parameter ordering and content negotiation based on Accept headers
Production systems at scale like imgix process approximately 1 billion images per day, demonstrating the architectural pattern of stateless transformation proxies behind aggressive edge caching with hardware optimized for graphics processing
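To make the adaptive bitrate takeaway concrete, here is a minimal sketch of how a player might pick a rendition from measured throughput. The ladder values, the `safety` margin, and the function name are illustrative assumptions, not figures from this article; real players also weigh buffer level and switching history.

```python
from typing import List, Tuple

# Hypothetical encoding ladder: (height in pixels, bitrate in kbit/s).
LADDER: List[Tuple[int, int]] = [
    (240, 400),
    (360, 800),
    (480, 1400),
    (720, 2800),
    (1080, 5000),
]


def pick_rendition(throughput_kbps: float, safety: float = 0.8) -> Tuple[int, int]:
    """Pick the highest-bitrate rendition that fits within a safety margin."""
    # Spend only part of the measured bandwidth to reduce rebuffering risk.
    budget = throughput_kbps * safety
    affordable = [rung for rung in LADDER if rung[1] <= budget]
    # If nothing fits, fall back to the lowest rung instead of stalling.
    return max(affordable, key=lambda rung: rung[1]) if affordable else LADDER[0]


print(pick_rendition(4000))  # (720, 2800)
print(pick_rendition(100))   # (240, 400)
```

Because the choice is re-evaluated at each segment boundary, shorter segments let the player react to bandwidth changes sooner, which is why the 2 to 4 second segment length matters.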
📌 Examples
A travel site reduced a homepage background video from 150 MB to 6 MB (96 percent reduction) using adaptive encoding and resolution downscaling, materially improving conversion rates
Cloud transformation services applying automatic quality and format selection (quality auto and format auto) commonly achieve 30 to 60 percent size reductions, such as a 2 MB JPEG compressed to 918 KB (over 50 percent) without visible quality loss
Netflix uses per-title and per-shot encoding to achieve 20 to 50 percent bitrate reduction at equal perceptual quality compared to static ladders, with short 2 to 4 second segments enabling fast ABR adaptation
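The percentages in these examples follow from a simple ratio, shown here as a small sketch. The function name is hypothetical, and the JPEG case assumes 1 MB = 1000 KB.

```python
def reduction_pct(original: float, optimized: float) -> float:
    """Percent size reduction; both sizes must use the same unit."""
    return (original - optimized) / original * 100


# Homepage video: 150 MB down to 6 MB.
print(round(reduction_pct(150, 6), 1))     # 96.0
# JPEG: 2 MB (taken as 2000 KB) down to 918 KB.
print(round(reduction_pct(2000, 918), 1))  # 54.1
```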