Networking & Protocols › Streaming Protocols (HLS, DASH, RTMP) · Medium · ~3 min

How Do Segmented HTTP Streaming and Adaptive Bitrate Work?

Segmented Streaming Fundamentals

HLS and DASH split video into time-aligned segments (typically 2-6 seconds long) encoded at multiple quality levels called renditions (e.g., 360p, 720p, 1080p, 4K). The player first fetches a manifest file (an .m3u8 playlist for HLS, an MPD for DASH) that lists the available renditions and their segment URLs. Segments are immutable once published, enabling aggressive CDN caching. This architecture turns live video into a series of static files served by the same CDN infrastructure that delivers web pages and images.
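As a concrete illustration, here is a minimal HLS master playlist sketch; the bitrates, resolutions, and playlist paths are illustrative, not from any particular service:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
```

Each entry points to a media playlist for one rendition, which in turn lists the immutable segment files (e.g., `#EXTINF:6.0,` followed by `segment0.ts`).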

Adaptive Bitrate Mechanics

ABR (Adaptive Bitrate) streaming allows players to switch between renditions at segment boundaries based on network conditions. The player estimates available bandwidth by measuring the download speed of recent segments. If a 720p segment downloads faster than real time, the player may request 1080p next; if it downloads slowly, the player drops to 480p. This prevents stalls during congestion and maximizes quality when bandwidth is plentiful. The bitrate ladder (the set of renditions offered) typically spans 300 Kbps to 15 Mbps, covering everything from mobile on cellular to 4K TVs on fiber.
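The selection logic above can be sketched in a few lines. This is a simplified throughput-based picker; real players smooth their estimates and add hysteresis to avoid oscillation, and the ladder bitrates here are hypothetical:

```python
# Hypothetical bitrate ladder: rendition name -> bitrate in bits per second.
RENDITIONS = {
    "360p": 800_000,
    "480p": 1_400_000,
    "720p": 2_800_000,
    "1080p": 5_000_000,
}

def estimate_bandwidth(recent_downloads):
    """Estimate throughput (bps) from (bytes, seconds) pairs of recent segments."""
    total_bits = sum(nbytes * 8 for nbytes, _ in recent_downloads)
    total_time = sum(secs for _, secs in recent_downloads)
    return total_bits / total_time

def pick_rendition(bandwidth_bps, safety=0.8):
    """Choose the highest rendition that fits within a safety margin of bandwidth."""
    budget = bandwidth_bps * safety
    fitting = [(name, bps) for name, bps in RENDITIONS.items() if bps <= budget]
    if not fitting:
        return min(RENDITIONS, key=RENDITIONS.get)  # fall back to lowest rung
    return max(fitting, key=lambda nb: nb[1])[0]

# A 2.1 MB segment downloaded in 2 s implies ~8.4 Mbps of throughput,
# leaving enough headroom (8.4 * 0.8 = 6.72 Mbps) to step up to 1080p.
bw = estimate_bandwidth([(2_100_000, 2.0)])
print(pick_rendition(bw))  # 1080p
```

The `safety` margin is the key design choice: switching only when measured throughput comfortably exceeds a rendition's bitrate trades a little quality for far fewer mid-stream downswitches.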

Latency Breakdown

Latency in segmented streaming depends on segment duration and buffering strategy. Traditional players buffer 3 segments before starting playback for stability. With 6-second segments, this creates ~18 seconds baseline latency; 2-second segments yield ~6 seconds. Adding encoding, packaging, CDN propagation, and network jitter typically results in 10-30 seconds end-to-end latency for traditional HLS/DASH. This is acceptable for broadcast content but problematic for interactive streams.
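The latency arithmetic is simple enough to write down directly. A sketch, assuming the 3-segment buffer described above and treating encoding, packaging, and CDN propagation as one lumped overhead term:

```python
def startup_latency(segment_s, buffer_segments=3, overhead_s=0.0):
    """Baseline glass-to-glass latency: buffered segments plus pipeline overhead."""
    return buffer_segments * segment_s + overhead_s

print(startup_latency(6))                 # 18 s baseline with 6 s segments
print(startup_latency(2))                 # 6 s baseline with 2 s segments
print(startup_latency(6, overhead_s=8))   # 26 s: within the 10-30 s range
```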

Low Latency Modes

LL-HLS (Low Latency HLS) and LL-DASH introduce partial segments, or chunks (200-500ms), published via HTTP chunked transfer encoding. Instead of waiting for complete segments, players request partial chunks as they are generated, reducing latency to 2-5 seconds while preserving CDN compatibility. This requires every hop in the delivery path (origin, CDN, proxies) to support chunked delivery without buffering. CMAF (Common Media Application Format), using fMP4 (fragmented MP4), allows a single stored set of segments to serve both HLS and DASH manifests, halving storage and cache footprint.
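In LL-HLS, partial segments appear in the media playlist as `EXT-X-PART` entries, with a preload hint for the part still being generated. A sketch of such a playlist fragment; file names and durations are illustrative:

```
#EXTM3U
#EXT-X-VERSION:9
#EXT-X-TARGETDURATION:4
#EXT-X-PART-INF:PART-TARGET=0.334
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,PART-HOLD-BACK=1.0
#EXTINF:4.0,
segment10.mp4
#EXT-X-PART:DURATION=0.334,URI="segment11.part0.mp4"
#EXT-X-PART:DURATION=0.334,URI="segment11.part1.mp4"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="segment11.part2.mp4"
```

The player can request `segment11.part2.mp4` before it exists; the origin holds the request open and streams bytes as they are encoded, which is what pushes playback to within a few parts of the live edge.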

Key Trade-off: Shorter segments improve latency and ABR responsiveness but increase request overhead. 6-second segments mean 0.17 requests/second/viewer; 2-second segments mean 0.5 requests/second/viewer. At 1 million viewers, that is the difference between roughly 167K and 500K segment RPS.
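The request-rate math above reduces to one division, since each viewer fetches one segment per segment duration:

```python
def segment_rps(viewers, segment_s):
    """Aggregate segment request rate: one request per viewer per segment duration."""
    return viewers / segment_s

print(f"{segment_rps(1_000_000, 6):,.0f}")  # 166,667 RPS with 6 s segments
print(f"{segment_rps(1_000_000, 2):,.0f}")  # 500,000 RPS with 2 s segments
```

Manifest refreshes add further load on top of this, which is why low-latency modes lean on blocking playlist reloads rather than tight polling.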
💡 Key Takeaways
HLS/DASH split video into 2-6 second segments encoded at multiple renditions; players fetch manifests listing segment URLs
ABR switches renditions at segment boundaries based on measured bandwidth, spanning 300 Kbps to 15 Mbps bitrate ladders
Traditional latency: 3-segment buffer × segment duration (6s segments = 18s baseline, plus encoding/CDN overhead)
LL-HLS/LL-DASH use 200-500ms partial chunks with HTTP chunked transfer, reducing latency to 2-5 seconds
📌 Interview Tips
1. Explain latency calculation: buffer depth (3 segments) times segment duration plus encoding and CDN propagation
2. Mention CMAF/fMP4 enabling single storage for both HLS and DASH, halving cache footprint
3. Discuss how shorter segments improve ABR responsiveness but increase request rate proportionally