How Do You Dimension and Scale Production Streaming Infrastructure?

Bandwidth Egress Calculations
Capacity planning for streaming requires modeling bandwidth egress, request rates, and origin load separately. Bandwidth egress is the product of average viewer bitrate and concurrent viewers. Example: 100,000 concurrent viewers at 3 Mbps average generates 300 Gbps of sustained egress. A 2-hour event at this scale delivers approximately 270 terabytes of data (300 Gbps / 8 = 37.5 GB/s, times 7,200 seconds). At CDN rates of $0.02-$0.08 per GB (varying by region and volume commits), bandwidth alone costs $5,400-$21,600, excluding transcoding, origin infrastructure, and operational costs. At 1 million concurrent viewers, the same event delivers about 2.7 petabytes and costs $54,000-$216,000. Large-scale streaming platforms reach peak traffic of hundreds of terabits per second globally using embedded CDN appliances within ISPs (Internet Service Providers).
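The egress arithmetic can be sketched in a few lines of Python. This is a minimal sketch assuming decimal units (1 GB = 1,000 MB) and a flat per-GB price; the function names are illustrative, not taken from any CDN's tooling.

```python
def egress_gbps(concurrent_viewers: int, avg_bitrate_mbps: float) -> float:
    """Sustained egress in Gbps (1 Gbps = 1,000 Mbps)."""
    return concurrent_viewers * avg_bitrate_mbps / 1_000


def data_delivered_tb(gbps: float, duration_hours: float) -> float:
    """Total data in terabytes: gigabits/s -> gigabytes/s -> GB over the event -> TB."""
    gigabytes = gbps / 8 * duration_hours * 3_600
    return gigabytes / 1_000


def cdn_cost_usd(terabytes: float, usd_per_gb: float) -> float:
    """Bandwidth cost at a flat per-GB rate (assumed; real pricing is tiered)."""
    return terabytes * 1_000 * usd_per_gb


gbps = egress_gbps(100_000, 3.0)        # 300 Gbps
tb = data_delivered_tb(gbps, 2.0)       # 270 TB
low = cdn_cost_usd(tb, 0.02)            # about $5,400
high = cdn_cost_usd(tb, 0.08)           # about $21,600
print(f"{gbps:.0f} Gbps, {tb:.0f} TB, ${low:,.0f}-${high:,.0f}")
```

Raising the viewer count to 1,000,000 scales every figure by 10: 3,000 Gbps, 2,700 TB (2.7 PB), and roughly $54,000-$216,000.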
Request Rate Planning
Request rate planning is more complex because it depends on segment duration, manifest update frequency, and rendition count. With 2-second segments, each viewer generates 0.5 segment requests per second. For 100,000 viewers, that is 50,000 segment RPS. Manifest requests add another layer: if players poll media playlists every 2 seconds, that is another 50,000 manifest RPS. With 6 renditions in the ABR ladder and players probing multiple renditions during adaptation, peak request rates can reach 200,000-300,000 RPS. Without proper CDN caching (manifest TTL 2-6 seconds, segment TTL hours to days) and multi-tier origin architecture, the origin collapses under this load.
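The request-rate estimate above can be written as a small calculator. This is a rough sketch: `probe_factor` is a hypothetical multiplier standing in for rendition probing during adaptation, not a measured value, and real players differ in how they poll.

```python
def request_rates(viewers: int, segment_seconds: float,
                  manifest_poll_seconds: float, probe_factor: float = 1.0) -> dict:
    """Estimate steady-state and peak requests per second at the CDN edge."""
    segment_rps = viewers / segment_seconds          # one segment per segment duration
    manifest_rps = viewers / manifest_poll_seconds   # one playlist poll per interval
    baseline_rps = segment_rps + manifest_rps
    return {
        "segment_rps": segment_rps,
        "manifest_rps": manifest_rps,
        "baseline_rps": baseline_rps,
        # probe_factor is an assumption used to approximate ABR probing bursts
        "peak_rps": baseline_rps * probe_factor,
    }


rates = request_rates(100_000, segment_seconds=2.0,
                      manifest_poll_seconds=2.0, probe_factor=2.5)
print(rates)
```

With these inputs the baseline is 100,000 RPS, and an assumed factor of 2-3 lands in the 200,000-300,000 RPS peak range.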
Cache Shielding Architecture
Production systems use cache shielding, where edge CDN nodes route cache misses to regional shield nodes, which in turn route to origin. This reduces origin request fan-in by 10-100x. Instead of 1,000 edge nodes each sending cache misses to origin, perhaps 10 regional shields aggregate those misses. The origin sees 10 requests instead of 1,000 for the same segment. Shield nodes also absorb the thundering herd that occurs when a new segment is published. Request coalescing at shield nodes groups concurrent requests for the same segment so that only one request reaches origin.
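Request coalescing can be illustrated with a single-flight pattern in Python's asyncio. This is a simplified sketch, not how any particular CDN implements it: the `CoalescingShield` class, the `fake_origin` function, and the segment URL are all hypothetical, and there is no caching, timeout, or error handling.

```python
import asyncio


class CoalescingShield:
    """Collapses concurrent requests for the same URL into one origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._in_flight = {}  # url -> asyncio.Task for the pending origin fetch

    async def get(self, url: str) -> bytes:
        task = self._in_flight.get(url)
        if task is None:
            # First requester becomes the leader and starts the origin fetch.
            task = asyncio.create_task(self._fetch(url))
            self._in_flight[url] = task
            task.add_done_callback(lambda _: self._in_flight.pop(url, None))
        # Leader and followers all wait on the same task.
        return await task


async def demo() -> int:
    origin_hits = 0

    async def fake_origin(url: str) -> bytes:
        nonlocal origin_hits
        origin_hits += 1
        await asyncio.sleep(0.01)  # simulated origin latency
        return b"segment-bytes"

    shield = CoalescingShield(fake_origin)
    # 1,000 edge misses arrive at the shield for the same new segment.
    await asyncio.gather(*(shield.get("/live/720p/seg_1001.m4s") for _ in range(1_000)))
    return origin_hits


origin_hits = asyncio.run(demo())
print(origin_hits)  # 1
```

All 1,000 concurrent misses share one origin fetch. A production shield would also cache the response so that later requests never reach this path.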
Storage and Packaging Efficiency
A 4-hour live event with 6 renditions at 4-second segments generates 3,600 segments per rendition (14,400 seconds / 4), totaling 21,600 segments. With separate HLS TS and DASH MP4 formats, you store 43,200 distinct files. Using CMAF (Common Media Application Format), you store 21,600 fMP4 segments and publish both HLS and DASH manifests referencing the same segments, halving storage and cache footprint. Multi-region packaging with deterministic segment naming enables instant failover: if origin A fails, DNS switches to origin B, and because segment URLs are identical, CDN caches remain valid and players continue without interruption.
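Both ideas here, file counts and deterministic naming, can be sketched briefly. The naming scheme below, which derives the segment index from wall-clock epoch time, is one assumed way to make independent packagers emit identical URLs. The path layout and function names are hypothetical.

```python
def segment_files(duration_hours: float, segment_seconds: float,
                  renditions: int, formats: int) -> int:
    """Number of stored segment files for an event."""
    per_rendition = int(duration_hours * 3_600 / segment_seconds)
    return per_rendition * renditions * formats


def segment_url(stream_id: str, rendition: str, epoch_seconds: int,
                segment_seconds: int = 4) -> str:
    """Derive the URL from time alone, so every region computes the same name."""
    index = epoch_seconds // segment_seconds
    return f"/live/{stream_id}/{rendition}/seg_{index}.m4s"


separate = segment_files(4, 4, renditions=6, formats=2)  # HLS TS + DASH MP4
cmaf = segment_files(4, 4, renditions=6, formats=1)      # shared fMP4
print(separate, cmaf)  # 43200 21600

# Two packagers in different regions, clocks a few seconds apart within one segment:
url_a = segment_url("event42", "720p", 1_700_000_000)
url_b = segment_url("event42", "720p", 1_700_000_003)
print(url_a == url_b)  # True
```

Because the name depends only on the stream, rendition, and time window, a segment cached from one origin stays valid when requests shift to the other. This assumes the packagers' clocks and segment boundaries are aligned.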
Key Insight: Streaming infrastructure scaling is about request rate, not just bandwidth. A 1 million viewer stream may generate 500,000+ RPS for manifests and segments combined. Cache shielding and request coalescing are essential to protect origin servers from this load.

💡 Key Takeaways

✓ Bandwidth formula: concurrent viewers times average bitrate (100K at 3 Mbps = 300 Gbps); 2-hour event = 270 TB

✓ Request rate: 100K viewers with 2s segments generate 100,000+ RPS (segments + manifests); multi-rendition probing adds more

✓ Cache shielding reduces origin fan-in by 10-100x by aggregating edge cache misses at regional nodes

✓ CMAF halves storage by enabling a single fMP4 format for both HLS and DASH; deterministic naming enables failover

📌 Interview Tips

1. Calculate bandwidth cost: 1M viewers at 3 Mbps for 2 hours = 2.7 PB = $54,000-$216,000 at CDN rates of $0.02-$0.08 per GB

2. Explain cache shielding: instead of 1,000 edge nodes hitting origin, 10 shields aggregate requests

3. Mention deterministic segment naming enabling instant multi-region failover without cache invalidation
