What is Image and Video Optimization in System Design?

Definition
Image and video optimization transforms media files to minimize size while maintaining acceptable quality, then serves appropriate variants based on client capabilities. This reduces bandwidth costs, improves page load times, and enables playback on diverse devices.
Why Optimization Matters at Scale
Media dominates internet traffic. By common estimates, images account for roughly 50% of average webpage weight and video streaming for over 80% of global internet traffic. Serving unoptimized originals wastes bandwidth and money. A 5MB photo can often be delivered to mobile users as a 200KB WebP without visible quality loss. Sending a 4K stream to a phone that displays only 480p wastes roughly 10x the bandwidth.
The Three Optimization Dimensions
Format optimization selects the most efficient encoding. AVIF is typically about 50% smaller than JPEG at equivalent quality, and WebP about 30% smaller, but not all clients support modern formats. Resolution optimization serves appropriately sized images: a 200px thumbnail does not need a 4000px source. Quality optimization lowers the encoder's quality setting, which means compressing more aggressively. Quality 85 is often visually identical to 100 but about 40% smaller. Combining all three (format, resolution, and quality) can reduce file size by 90% or more.
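To see how the three dimensions compound, here is a rough back-of-the-envelope sketch. It assumes the savings multiply independently and that file size scales with pixel count; both are simplifications, and the function name and the 1200px target are illustrative choices, not figures from a real encoder.

```python
def remaining_fraction(format_saving: float, quality_saving: float,
                       source_width: int, target_width: int) -> float:
    """Estimate the fraction of the original bytes left after all three steps.

    Assumptions (illustrative only): the savings are independent and
    multiplicative, and bytes scale with the number of pixels.
    """
    pixel_fraction = (target_width / source_width) ** 2
    return (1 - format_saving) * (1 - quality_saving) * pixel_fraction


# Format and quality alone: 0.5 * 0.6 = 0.30 of the original (a 70% reduction).
format_and_quality = remaining_fraction(0.50, 0.40, 4000, 4000)

# Adding a resize from 4000px to 1200px wide: 0.30 * 0.09 = 0.027 of the original.
all_three = remaining_fraction(0.50, 0.40, 4000, 1200)

print(f"format + quality: {1 - format_and_quality:.0%} smaller")
print(f"all three:        {1 - all_three:.1%} smaller")
```

Under these assumptions, format and quality alone fall short of 90%. The resolution step is what pushes the combined saving past it.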
💡 Key Insight: Optimization is not about reducing quality. It is about eliminating waste: unnecessary pixels, inefficient encoding, unneeded precision. Users cannot perceive the removed data.
Video Optimization Complexity
Video adds a temporal dimension, and with it several more decisions. The codec determines the compression algorithm: H.264 is universally supported but dated, H.265 is roughly 50% smaller but carries licensing complications, and AV1 is open and efficient but CPU-intensive. Bitrate controls the tradeoff between quality and size. Resolution ranges from 360p to 4K. Adaptive streaming switches between quality levels based on network conditions. As a result, a single video becomes multiple encoded variants that together form an "encoding ladder."
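A minimal sketch of how a player might pick a rung from an encoding ladder is shown here. The ladder bitrates, the 0.8 safety factor, and the names `Rendition`, `LADDER`, and `pick_rendition` are all assumptions for illustration. Real players use more elaborate heuristics, such as buffer level and throughput history.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # average encoded bitrate


# Hypothetical ladder; the bitrates are illustrative, not recommendations.
LADDER: List[Rendition] = [
    Rendition("360p", 360, 800),
    Rendition("480p", 480, 1400),
    Rendition("720p", 720, 2800),
    Rendition("1080p", 1080, 5000),
    Rendition("2160p", 2160, 16000),
]


def pick_rendition(throughput_kbps: float, screen_height: int,
                   safety: float = 0.8) -> Rendition:
    """Pick the highest-bitrate rung that fits the bandwidth budget and the screen."""
    budget = throughput_kbps * safety  # leave headroom for throughput dips
    candidates = [r for r in LADDER
                  if r.bitrate_kbps <= budget and r.height <= screen_height]
    if not candidates:
        return LADDER[0]  # fall back to the lowest rung rather than stalling
    return max(candidates, key=lambda r: r.bitrate_kbps)
```

For example, `pick_rendition(4000, 1080)` has a budget of 3200 kbps and selects the 720p rung, while a fast connection on a 480p screen is still capped at 480p.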
Client Capability Detection
Optimization requires knowing client capabilities. Format support is detected via the Accept request header (which lists the image formats the client supports) or JavaScript feature detection. Image dimensions are handled either on the client, where the browser picks a candidate from the srcset attribute based on viewport size, or on the server, using client hints headers. Network quality is inferred from the Network Information API or client hints such as Downlink and RTT, while the Save-Data header signals an explicit user preference for reduced data usage. Device pixel ratio determines whether retina images are needed. Modern CDNs can parse these signals and route to appropriate variants automatically.
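As a rough illustration of Accept-based format negotiation, the sketch below picks the most efficient format that a request advertises. The function name and preference order are assumptions, and it deliberately ignores q-values, which a production implementation should honor.

```python
def pick_image_format(accept_header: str) -> str:
    """Return the preferred image MIME type that the Accept header lists.

    Simplification: parameters such as ";q=0.8" are stripped and ignored.
    """
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    # Preference order is an assumption: most efficient format first.
    for mime in ("image/avif", "image/webp"):
        if mime in accepted:
            return mime
    return "image/jpeg"  # universally supported fallback


# A modern browser typically advertises AVIF and WebP explicitly.
print(pick_image_format("image/avif,image/webp,image/apng,image/*,*/*;q=0.8"))
# A client that sends only a wildcard gets the safe fallback.
print(pick_image_format("*/*"))
```

When one URL can return different formats, the response should carry `Vary: Accept` so that caches store each variant separately.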
The Serving Challenge
One source image might need to be served as a 100px thumbnail in WebP for mobile, a 400px card in JPEG for old browsers, and a 1200px detail view in AVIF for modern desktops. Multiplied across millions of images, managing all variants becomes complex. There are two approaches: pre-generate all variants on upload (higher storage, lower latency), or generate them on demand at request time (lower storage, higher complexity and slower first requests). Most production systems use a hybrid: popular sizes are pre-generated and rare sizes are generated on demand.
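The hybrid approach can be sketched as a store that renders popular sizes at upload time and fills in everything else lazily. This is a minimal in-memory illustration: `VariantStore`, `PREGENERATED_WIDTHS`, and the injected `render` callable are hypothetical names, and a real system would back the cache with object storage and a CDN.

```python
from typing import Callable, Dict, Sequence, Tuple

VariantKey = Tuple[str, int, str]  # (image_id, width, format)

# Assumed "popular" widths, taken from the example sizes in the text.
PREGENERATED_WIDTHS = (100, 400, 1200)


class VariantStore:
    def __init__(self, render: Callable[[str, int, str], bytes]) -> None:
        self._render = render  # hypothetical resize-and-encode function
        self._cache: Dict[VariantKey, bytes] = {}

    def on_upload(self, image_id: str,
                  formats: Sequence[str] = ("avif", "webp", "jpeg")) -> None:
        """Pre-generate the popular sizes so their first request is fast."""
        for width in PREGENERATED_WIDTHS:
            for fmt in formats:
                self._cache[(image_id, width, fmt)] = self._render(image_id, width, fmt)

    def get(self, image_id: str, width: int, fmt: str) -> bytes:
        """Serve a stored variant, rendering rare sizes on first request."""
        key = (image_id, width, fmt)
        if key not in self._cache:
            self._cache[key] = self._render(image_id, width, fmt)
        return self._cache[key]
```

With this shape, only the first request for a rare size pays the rendering cost; later requests for the same variant are served from the cache.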

💡 Key Takeaways

✓ Images are roughly 50% of webpage weight and video is over 80% of internet traffic; optimization eliminates waste without perceived quality loss

✓ Three dimensions: format (AVIF 50% smaller than JPEG), resolution (serve appropriately sized), quality (85 vs 100 saves 40%)

✓ Video adds complexity: codec choice, bitrate, adaptive streaming with encoding ladders for multiple quality levels

✓ Client capability detection via Accept header, srcset, and client hints enables serving optimal variants automatically

📌 Interview Tips

1. Explain the three optimization dimensions and how combining all three can reduce file size by 90%+

2. When discussing video, mention the encoding ladder concept where one source becomes multiple quality variants

3. Describe client capability detection methods: Accept header for format, srcset for resolution, client hints for network
