
How Do You Dimension and Scale Production Streaming Infrastructure?

Capacity planning for streaming systems requires modeling bandwidth egress, request rates, and origin load separately. Bandwidth egress is the product of average viewer bitrate and concurrent viewers. For example, 100,000 concurrent viewers at an average of 3 Mbps generate 300 Gbps of sustained egress. A 2 hour event at this scale delivers approximately 270 terabytes of data (300 Gbps times 7,200 seconds, divided by 8 bits per byte). At typical CDN rates of $0.02 to $0.08 per GB depending on region and volume commits, bandwidth alone costs $5,400 to $21,600, excluding transcoding, origin, and operational costs. Netflix's global peak traffic exceeds several hundred terabits per second, delivered via Open Connect Appliances placed within Internet Service Providers (ISPs), with individual appliances serving tens of Gbps.

Request rate planning is more complex because it depends on segment duration, manifest update frequency, and rendition count. With 2 second segments, each viewer generates approximately 0.5 segment requests per second. For 100,000 viewers, that is 50,000 segment RPS. Manifest requests add another layer: if players poll media playlists every 2 seconds, that is another 50,000 manifest RPS. With 6 renditions in the ABR ladder and players probing multiple renditions during adaptation, peak request rates can reach 200,000 to 300,000 RPS. Without proper CDN caching (manifest TTL 2 to 6 seconds, segment TTL hours to days) and a multi tier origin architecture, the origin will collapse under this load. Production systems use cache shielding, where edge CDN nodes route cache misses to regional shield nodes, which in turn route their own misses to the origin, reducing origin request fan in by 10 to 100x.

Storage and packaging efficiency is critical at scale. A 1 hour live event with 6 renditions at 4 second segments generates 3,600 seconds divided by 4, or 900 segments per rendition, totaling 5,400 segments. If you package HLS TS and DASH MP4 separately, you store and cache 10,800 distinct files. Using CMAF, you store 5,400 fMP4 chunks and publish both HLS and DASH manifests referencing the same chunks, halving the storage and cache footprint and cutting origin storage and CDN cache costs accordingly.

Multi region packaging with deterministic segment naming allows immediate failover without player confusion: if origin A fails, DNS or CDN routing switches to origin B, and because segment URLs are identical (for example, chunk_t1234567890_v720p.m4s), CDN caches remain valid and players continue without stalls.
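The bandwidth, cost, and request rate arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than a planning tool: the class and method names are hypothetical, decimal units (1 TB = 1,000 GB) are assumed, and the model ignores audio tracks, rendition probing, and cache hit ratios.

```python
from dataclasses import dataclass


@dataclass
class EventLoad:
    """Hypothetical first-order capacity model for a single live event."""
    viewers: int
    avg_bitrate_mbps: float
    duration_hours: float
    segment_seconds: float
    manifest_poll_seconds: float

    def egress_gbps(self) -> float:
        # Sustained egress = average bitrate x concurrent viewers.
        return self.viewers * self.avg_bitrate_mbps / 1_000

    def total_terabytes(self) -> float:
        # Gbps x seconds = gigabits; /8 = gigabytes; /1000 = terabytes.
        gigabits = self.egress_gbps() * self.duration_hours * 3_600
        return gigabits / 8 / 1_000

    def bandwidth_cost_usd(self, usd_per_gb: float) -> float:
        return self.total_terabytes() * 1_000 * usd_per_gb

    def segment_rps(self) -> float:
        # Each viewer fetches one segment per segment duration.
        return self.viewers / self.segment_seconds

    def manifest_rps(self) -> float:
        # Each viewer polls the media playlist once per poll interval.
        return self.viewers / self.manifest_poll_seconds


load = EventLoad(
    viewers=100_000,
    avg_bitrate_mbps=3.0,
    duration_hours=2.0,
    segment_seconds=2.0,
    manifest_poll_seconds=2.0,
)
print(f"egress: {load.egress_gbps():.0f} Gbps")
print(f"delivered: {load.total_terabytes():.0f} TB")
print(f"cost: ${load.bandwidth_cost_usd(0.02):,.0f} to ${load.bandwidth_cost_usd(0.08):,.0f}")
print(f"baseline RPS: {load.segment_rps() + load.manifest_rps():,.0f}")
```

The baseline RPS here is only the sum of segment and manifest requests at steady state. The 200,000 to 300,000 RPS peak in the text comes from rendition probing on top of that, which this sketch does not model.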
💡 Key Takeaways
Bandwidth formula: average bitrate times concurrent viewers. Example: 3 Mbps times 100,000 equals 300 Gbps sustained egress; over 2 hours that is about 270 TB, costing $5,400 to $21,600 at $0.02 to $0.08 per GB
Request rate: 100,000 viewers with 2 second segments generate 50,000 segment RPS plus 50,000 manifest RPS, scaling to 200,000 to 300,000 total RPS with multi rendition probing
Multi tier origin with cache shielding reduces origin request fan in by 10 to 100x; edge CDN routes cache misses to regional shields before origin
CMAF packaging halves storage and cache footprint by eliminating duplicate TS (HLS) and MP4 (DASH) files; a 1 hour event with 6 renditions and 4 second segments drops from 10,800 to 5,400 files
Netflix Open Connect delivers hundreds of Tbps globally using ISP embedded appliances serving tens of Gbps each, achieving startup under 1 to 2 seconds and rebuffer ratio under 0.5 to 1 percent
Deterministic segment naming enables instant multi region origin failover without CDN cache invalidation or player disruption
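The segment count and deterministic naming takeaways can be illustrated with a short Python sketch. The function names are hypothetical, and the naming scheme simply mirrors the chunk_t1234567890_v720p.m4s example: the name is derived from the media timeline (start time aligned to the segment duration) rather than from any per-packager counter, so two independent packagers should emit the same URL for the same content, assuming they share a synchronized clock or timeline.

```python
def segments_per_rendition(duration_seconds: int, segment_seconds: int) -> int:
    # Ceiling division: a trailing partial segment still produces a file.
    return -(-duration_seconds // segment_seconds)


def stored_files(duration_seconds: int, segment_seconds: int,
                 renditions: int, container_formats: int) -> int:
    # container_formats = 2 for separate HLS TS + DASH MP4, 1 for shared CMAF.
    per_rendition = segments_per_rendition(duration_seconds, segment_seconds)
    return per_rendition * renditions * container_formats


def segment_name(start_epoch_seconds: int, segment_seconds: int,
                 rendition: str) -> str:
    # Align to the segment grid so every packager computes the same name.
    aligned = start_epoch_seconds - (start_epoch_seconds % segment_seconds)
    return f"chunk_t{aligned}_v{rendition}.m4s"


one_hour = 3_600
print(stored_files(one_hour, 4, renditions=6, container_formats=2))
print(stored_files(one_hour, 4, renditions=6, container_formats=1))
print(segment_name(1234567891, 2, "720p"))
```

Because the name depends only on the aligned timestamp and the rendition label, a cached object fetched from origin A remains a valid answer for the same URL after routing switches to origin B.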
📌 Examples
A 1 million viewer live sports event at 3 Mbps average bitrate for 2 hours: 3 Mbps times 1 million equals 3 Tbps sustained egress, 2.7 PB total delivery, $54,000 to $216,000 bandwidth cost, plus $10,000 to $50,000 transcoding cost for 6 renditions
YouTube Live uses multi tier manifest caching: edge CDN caches for 2 to 4 seconds, regional shields aggregate requests, origin serves manifest updates at 10,000 to 50,000 RPS even with millions of viewers
Amazon IVS automatically scales ingest, transcode (generating 4 to 6 renditions), packaging (LL-HLS with partial chunks), and multi CDN delivery for channels handling 10,000 to 100,000 plus concurrents
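The effect of cache shielding on origin load can be approximated with a simple upper bound. This is a rough sketch under stated assumptions: every cache node collapses concurrent misses and refetches each live manifest at most once per TTL, and the node counts (2,000 edges, 20 shields) are hypothetical figures chosen only to illustrate the 10 to 100x range, not measurements from any real CDN.

```python
def origin_manifest_rps(cache_nodes: int, renditions: int,
                        ttl_seconds: float) -> float:
    # Upper bound: each node directly in front of the origin refetches
    # each rendition's manifest once per TTL, regardless of viewer count.
    return cache_nodes * renditions / ttl_seconds


edge_nodes = 2_000    # hypothetical edge cache count
shield_nodes = 20     # hypothetical regional shield count
renditions = 6
manifest_ttl = 2.0    # seconds

without_shield = origin_manifest_rps(edge_nodes, renditions, manifest_ttl)
with_shield = origin_manifest_rps(shield_nodes, renditions, manifest_ttl)

print(f"origin RPS, edges hit origin directly: {without_shield:,.0f}")
print(f"origin RPS, shields in front of origin: {with_shield:,.0f}")
print(f"fan in reduction: {without_shield / with_shield:.0f}x")
```

Under this model the origin load depends on the number of caches in front of it and the TTL, not on the number of viewers, which is why a shield tier keeps origin request rates bounded as the audience grows.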