How Do You Choose the Right Streaming Protocol for Your Use Case?
Choosing the right streaming protocol and architecture depends on latency requirements, scale, interactivity, and device coverage. For public live streams that need to reach millions of viewers across all device types with acceptable latency (10 to 30 seconds), the default is RTMP ingest with HLS and DASH delivery over CDN. This architecture maximizes cache efficiency, minimizes cost per viewer, and provides universal device coverage. HLS is mandatory for Apple devices (iOS, tvOS, Safari), while DASH provides an open standard with wide support on Android, smart TVs, and browsers via Media Source Extensions (MSE). Using CMAF fMP4 packaging allows a single set of segments to support both protocols, avoiding storage and cache duplication.
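To make the CMAF point concrete, a single ladder of fMP4 segments can be advertised to HLS players through a multivariant playlist like the sketch below, while a DASH MPD references the same segment files. The rendition paths, bitrates, and codec strings here are illustrative assumptions, not values from the original:

```
#EXTM3U
#EXT-X-VERSION:6

#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2"
720p/playlist.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
1080p/playlist.m3u8
```

Because both protocols point at the same CMAF fMP4 segments, the CDN caches one copy per rendition instead of two.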
For interactive streams where viewers need to react to live events in near real time (auctions, live gaming with chat, Q&A sessions), 2 to 5 second latency is required. This is the domain of Low-Latency HLS (LL-HLS) and Low-Latency DASH using partial chunks (200 to 500 milliseconds) with HTTP chunked transfer. These protocols maintain CDN compatibility and device coverage while reducing latency by an order of magnitude compared to traditional segmented streaming. The trade-off is higher infrastructure load (more requests per viewer, reduced cache efficiency) and the need to validate that the entire delivery path (origin, CDN, proxies) supports chunked delivery without buffering. Amazon IVS and similar managed services abstract this complexity, while custom implementations require careful path validation and CDN configuration.
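In LL-HLS, the partial chunks appear in the media playlist as EXT-X-PART entries published before their parent segment completes. A simplified media playlist might look like the following sketch (segment names, durations, and hold-back values are illustrative):

```
#EXTM3U
#EXT-X-VERSION:9
#EXT-X-TARGETDURATION:4
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,PART-HOLD-BACK=1.0
#EXT-X-PART-INF:PART-TARGET=0.333
#EXT-X-MAP:URI="init.mp4"
#EXT-X-MEDIA-SEQUENCE:100
#EXTINF:4.0,
seg100.mp4
#EXT-X-PART:DURATION=0.333,URI="seg101.part0.mp4",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.333,URI="seg101.part1.mp4"
```

Each 333 ms part is a separate request, which is exactly the higher request rate and reduced cache efficiency trade-off described above.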
For sub-second latency where every millisecond matters (video conferencing, real-time collaboration, live auctions with millisecond bidding), WebRTC is the only viable choice. WebRTC uses UDP-based RTP with sub-second latency but eliminates CDN caching entirely, requiring direct viewer connections to media servers. This limits scale and dramatically increases infrastructure costs, typically restricting group sizes to hundreds or low thousands of participants. For contribution (broadcaster to ingest), SRT and RIST are increasingly preferred over RTMP for long-haul or unreliable networks because they use UDP with ARQ (automatic repeat request) for superior loss recovery. A resilient production setup uses dual-path ingest (for example, primary RTMP plus backup SRT to different POPs) with automatic failover on packet loss or latency spikes, ensuring critical events continue even if one path fails.
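The dual-path failover decision can be sketched as a small policy function. The thresholds, path names, and `PathHealth` fields below are illustrative assumptions for the sketch, not a real SRT or RTMP API:

```python
from dataclasses import dataclass

# Illustrative thresholds; real values depend on the event's latency budget.
LOSS_THRESHOLD = 0.02       # fail over above 2% packet loss
LATENCY_THRESHOLD_MS = 800  # fail over on round-trip latency spikes

@dataclass
class PathHealth:
    name: str           # e.g. "rtmp-primary", "srt-backup" (hypothetical names)
    packet_loss: float  # fraction of packets lost in the last measurement window
    latency_ms: float   # measured round-trip latency

def healthy(path: PathHealth) -> bool:
    """A path is healthy while both metrics stay under their thresholds."""
    return (path.packet_loss <= LOSS_THRESHOLD
            and path.latency_ms <= LATENCY_THRESHOLD_MS)

def select_ingest_path(primary: PathHealth, backup: PathHealth) -> str:
    """Prefer the primary ingest path; fail over when it breaches thresholds."""
    if healthy(primary):
        return primary.name
    if healthy(backup):
        return backup.name
    # Both degraded: stay on whichever currently loses fewer packets.
    return min((primary, backup), key=lambda p: p.packet_loss).name
```

In production this decision would run continuously against metrics reported by the ingest servers, with hysteresis to avoid flapping between paths.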
💡 Key Takeaways
•Default for public broadcast at scale: RTMP ingest plus HLS and DASH delivery over CDN, achieving 10 to 30 second latency with maximum cache efficiency and device coverage
•Interactive streams requiring 2 to 5 second latency: Low-Latency HLS or Low-Latency DASH with 200 to 500 millisecond partial chunks, trading higher request rates for reduced latency
•Sub-second latency for real-time interaction: WebRTC end-to-end, eliminating CDN caching and limiting scale to hundreds or low thousands of participants due to server fan-out constraints
•Contribution over unreliable networks: SRT or RIST ingest using UDP with ARQ outperforms RTMP by avoiding TCP head-of-line blocking on lossy links
•Dual-path ingest (different POPs or protocols) with automatic failover ensures resilience for critical events when packet loss or latency exceeds thresholds
•CMAF packaging enables single storage and cache for both HLS (Apple devices) and DASH (Android, smart TVs, browsers), halving infrastructure footprint
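The takeaways above can be condensed into a small decision helper. The 1 s and 5 s boundaries are simplifications of the latency bands discussed, not hard protocol limits:

```python
def choose_protocol(latency_budget_s: float) -> str:
    """Map a latency budget to the delivery protocol family (simplified bands)."""
    if latency_budget_s < 1.0:
        return "WebRTC"            # sub-second: direct media-server connections, no CDN cache
    if latency_budget_s <= 5.0:
        return "LL-HLS / LL-DASH"  # 2-5 s: partial chunks over CDN, higher request rate
    return "HLS / DASH"            # 10-30 s: segmented delivery, best cache efficiency
```

A real selection would also weigh audience size, device coverage, and cost, not latency alone.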
📌 Examples
Standard broadcast: Twitch uses RTMP ingest and HLS delivery for millions of concurrent viewers per channel, accepting 10 to 20 second latency in exchange for maximum scale and CDN cache efficiency
Interactive broadcast: Live auction platforms use Low-Latency HLS or managed services like Amazon IVS to achieve 2 to 5 second latency, allowing bidders to react in near real time while supporting thousands of concurrent participants
Real-time collaboration: Google Meet and Zoom use WebRTC for sub-second latency in group video calls, limiting participant counts to 50 to 300 based on server capacity and accepting higher per-user infrastructure cost