Little's Law and the Latency-Concurrency-Throughput Triangle
Concurrency = Throughput × Latency (Little's Law: L = λ × W). When latency rises, you need proportionally more concurrent connections to maintain the same throughput. This relationship explains why systems collapse under load.
THE MATH IN ACTION
An API serves 10,000 RPS at 50ms latency. Concurrency = 10,000 × 0.05 = 500. You need 500 concurrent connections to sustain this throughput, so the thread pool, connection pool, and memory must all handle 500 simultaneous operations.
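The calculation above can be expressed as a one-line helper (the function name is illustrative, not from any particular library):

```python
def required_concurrency(throughput_rps: float, latency_s: float) -> float:
    """Little's Law: L = lambda * W (concurrency = throughput * latency)."""
    return throughput_rps * latency_s

# The worked example: 10,000 RPS at 50ms latency.
print(required_concurrency(10_000, 0.050))  # 500.0
```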
THE DEATH SPIRAL
The database slows down and latency jumps from 50ms to 200ms. To maintain 10,000 RPS, concurrency must rise to 2,000, but the thread pool caps at 1,000. Requests queue. Queue time adds latency, pushing it toward 500ms, which now demands 5,000 slots. More queuing follows, and the system spirals into failure.
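The feedback loop can be sketched as a toy iteration: required slots exceed the pool, the excess queues, queue wait inflates latency, and the inflated latency raises the requirement further. This is an illustration of the spiral under invented simplifying assumptions (queue wait ≈ queued requests ÷ arrival rate), not a proper queueing model:

```python
# Toy simulation of the death spiral described above.
target_rps = 10_000
pool_size = 1_000
latency_s = 0.200            # latency right after the database slowdown

for step in range(4):
    needed = target_rps * latency_s          # Little's Law: slots required
    if needed <= pool_size:
        break                                # pool is big enough; stable
    queued = needed - pool_size              # requests waiting for a slot
    queue_wait = queued / target_rps         # crude extra wait per request
    latency_s += queue_wait                  # queue time feeds back into latency
    print(f"step {step}: need {needed:.0f} slots, latency now {latency_s * 1000:.0f} ms")
```

Each pass demands more slots than the last; with these numbers latency climbs 200ms → 300ms → 500ms → 900ms and keeps growing, matching the trajectory in the text.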
PLANNING WITH THE FORMULA
Target: 20,000 RPS at 100ms p95. Required concurrency = 20,000 × 0.1 = 2,000. Add 30% headroom: provision for 2,600 concurrent connections. Size pools accordingly.
Each request uses 50KB of memory? Then 2,600 concurrent requests need 130MB just for request data, plus overhead.
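The planning steps above fit in one small helper (names and defaults are illustrative; 50KB is taken here as 50,000 bytes, matching the 130MB figure in the text):

```python
def plan_capacity(target_rps: float, p95_latency_s: float,
                  headroom: float = 0.30, mem_per_req_bytes: int = 50_000):
    """Return (provisioned slots, memory in bytes) for a throughput target."""
    concurrency = target_rps * p95_latency_s         # Little's Law
    provisioned = concurrency * (1 + headroom)       # add headroom
    memory_bytes = provisioned * mem_per_req_bytes   # request data only
    return round(provisioned), memory_bytes

# The worked example: 20,000 RPS at 100ms p95 with 30% headroom.
slots, mem = plan_capacity(20_000, 0.100)
print(slots, mem / 1e6)  # → 2600 130.0
```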
SERVICE TIME CEILING
Each request needs 10ms of CPU time and you have 8 cores? Max throughput is 800 RPS (8 cores × 100 requests/second per core). Adding threads does not help when you are CPU bound; the only fixes are optimizing the code or adding servers.
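The ceiling is simple arithmetic, sketched here as a hypothetical helper:

```python
def max_rps_cpu_bound(cores: int, cpu_ms_per_request: float) -> float:
    """Throughput ceiling when every request consumes CPU for its whole service time."""
    per_core_rps = 1000.0 / cpu_ms_per_request   # requests one core finishes per second
    return cores * per_core_rps

# The worked example: 10ms of CPU per request on 8 cores.
print(max_rps_cpu_bound(8, 10))  # → 800.0
```

No thread-pool setting changes this number; only cheaper requests (lower cpu_ms_per_request) or more cores raise the ceiling.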