Object Storage & Blob StorageImage/Video Optimization & ServingHard⏱️ ~3 min

Adaptive Bitrate Streaming: Encoding Ladders and Segment Strategy

Definition
Adaptive bitrate streaming (ABR) delivers video in small segments, each available at multiple quality levels. The player monitors network conditions and switches quality levels mid-stream to balance video quality against buffering risk.

How ABR Works

A video is split into segments (typically 2-10 seconds each). Each segment is encoded at multiple bitrates (the "encoding ladder"). A manifest file lists all segments and available qualities. Player downloads manifest, estimates bandwidth, requests appropriate quality for next segment. If bandwidth drops, player switches to lower quality. If bandwidth improves, player switches higher. Switches happen at segment boundaries to avoid mid-segment quality jumps.

Encoding Ladder Design

The encoding ladder defines available quality levels. A typical ladder: 360p at 500kbps, 480p at 1Mbps, 720p at 2.5Mbps, 1080p at 5Mbps, 4K at 15Mbps. Each step roughly doubles bitrate and storage. More rungs mean smoother quality transitions but higher storage costs. Fewer rungs mean jarring transitions but lower storage. Optimal ladder depends on content: animation needs less bitrate than action scenes at same resolution.

💡 Key Insight: Per title encoding analyzes each video to determine optimal bitrate per resolution. An animated video might look great at 1080p/2Mbps while an action movie needs 1080p/8Mbps. This can reduce storage by 30-50% versus fixed ladders.

Segment Duration Tradeoffs

Shorter segments (2 seconds) enable faster quality adaptation. Player can react to bandwidth changes quickly. But: more segments mean more HTTP requests (overhead), smaller compression efficiency (keyframe overhead), and more manifest entries. Longer segments (10 seconds) are more efficient but slower to adapt. Standard production choice: 6 seconds. Live streaming often uses 2-4 seconds for lower latency.

HLS vs DASH

HLS (HTTP Live Streaming) is the original Apple protocol. Uses .m3u8 manifest and .ts segments. Native support on Apple devices. Required for iOS playback. DASH (Dynamic Adaptive Streaming over HTTP) is an open standard. Uses .mpd manifest and .m4s segments. More flexible but no iOS native support. Production approach: encode once, package for both HLS and DASH from same encoded segments. Use CMAF (Common Media Application Format) for segment format compatible with both.

Buffer Management

Players maintain a buffer of downloaded segments. Buffer size balances: larger buffer absorbs network variations but increases startup delay and memory usage. Smaller buffer enables faster startup but risks rebuffering on network hiccups. Typical strategy: small initial buffer for fast startup (2-4 seconds), grow to larger steady state buffer (30-60 seconds). When buffer runs low, aggressively switch to lower quality. When buffer is full, opportunistically upgrade quality.

Bandwidth Estimation Challenges

Accurate bandwidth estimation is critical but difficult. Simple approach: measure download time of last segment, calculate throughput, predict future bandwidth. Problems: bandwidth varies, concurrent requests interfere, CDN behavior is inconsistent. Advanced approaches: weighted moving average of recent measurements, exclude outliers, consider buffer state in quality decisions. A conservative estimator prevents rebuffering but may serve lower quality than necessary. An aggressive estimator maximizes quality but risks rebuffering. Most players err conservative since rebuffering frustrates users more than slightly lower quality.

💡 Key Takeaways
ABR delivers video in 2-10 second segments at multiple quality levels, switching at segment boundaries based on bandwidth
Encoding ladder: 360p/500kbps to 4K/15Mbps with each step roughly doubling bitrate - per title encoding saves 30-50%
Segment duration tradeoff: 2 seconds for fast adaptation but more overhead, 6 seconds standard, 10 seconds for efficiency
Buffer strategy: small initial buffer (2-4s) for fast start, grow to 30-60s steady state, aggressive quality drops when buffer low
📌 Interview Tips
1Explain the encoding ladder concept with specific numbers: 360p/500kbps through 4K/15Mbps with per title optimization
2Describe the segment duration tradeoff and why 6 seconds is standard for VOD while live streaming uses 2-4 seconds
3Mention buffer management strategy: fast startup with small buffer then grow, aggressive quality drops to prevent rebuffering
← Back to Image/Video Optimization & Serving Overview
Adaptive Bitrate Streaming: Encoding Ladders and Segment Strategy | Image/Video Optimization & Serving - System Overflow