
Real World Latency and Throughput Numbers Every Engineer Should Know

Here is the latency table every engineer should memorize:

| Operation | Latency (ns) |
| --- | --- |
| L1 cache reference | 0.5 |
| L2 cache reference | 3 |
| Branch mispredict | 5 |
| Mutex lock/unlock (uncontended) | 15 |
| Main memory reference | 50 |
| Compress 1K bytes with Snappy | 1,000 |
| Read 4KB from SSD | 20,000 |
| Round trip within same datacenter | 50,000 |
| Read 1MB sequentially from memory | 64,000 |
| Read 1MB over 100 Gbps network | 100,000 |
| Read 1MB from SSD | 1,000,000 |
| Disk seek | 5,000,000 |
| Read 1MB sequentially from disk | 10,000,000 |
| Send packet CA to Netherlands to CA | 150,000,000 |

Key Patterns to Notice

Order-of-magnitude jumps: A main memory reference (50 ns) to an SSD random read (20 µs) is 400x slower. Reading 1MB from SSD (1 ms) versus disk (10 ms) is another 10x. A cross-continent round trip (150 ms) is 3,000x slower than a round trip within the same datacenter (50 µs).

Sequential vs random: Reading 1MB sequentially from SSD (1 ms) is about 5x faster than fetching the same 1MB as 256 random 4KB reads (256 × 20 µs ≈ 5 ms). On spinning disk the gap is far larger, since each random read pays a ~5 ms seek.
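One way to internalize these patterns is to derive the ratios directly from the table. A quick sanity-check sketch (dictionary keys are informal labels, not from the original):

```python
# Latency figures from the table above, in nanoseconds.
LATENCY_NS = {
    "main memory reference": 50,
    "ssd random read 4KB": 20_000,
    "same-datacenter round trip": 50_000,
    "ssd sequential read 1MB": 1_000_000,
    "disk sequential read 1MB": 10_000_000,
    "cross-continent round trip": 150_000_000,
}

# Memory reference vs. SSD random read: 20,000 / 50 = 400x
print(LATENCY_NS["ssd random read 4KB"] // LATENCY_NS["main memory reference"])

# SSD vs. disk for a 1MB sequential read: 10x
print(LATENCY_NS["disk sequential read 1MB"] // LATENCY_NS["ssd sequential read 1MB"])

# Cross-continent vs. same-datacenter round trip: 3,000x
print(LATENCY_NS["cross-continent round trip"] // LATENCY_NS["same-datacenter round trip"])

# 1MB as 256 random 4KB reads (~5.12 ms) vs. one sequential 1MB read (1 ms)
random_1mb_ns = 256 * LATENCY_NS["ssd random read 4KB"]
print(random_1mb_ns / LATENCY_NS["ssd sequential read 1MB"])  # 5.12, i.e. ~5x
```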

Key Insight: Memorize these orders of magnitude. In interviews, being able to say "cross-region is about 150ms round trip" or "SSD random read is 20 microseconds" demonstrates practical systems knowledge.
💡 Key Takeaways
- Main memory (50 ns reference) to SSD (20 µs random read) is roughly 400x slower; an SSD read to a spinning-disk seek (5 ms) is another ~250x; this hierarchy drives all caching decisions
- Cross-region latency is 40 to 70 ms and cross-continent 80 to 120 ms; the speed of light sets a hard physical floor of roughly 40 ms round trip for US coast to coast
- Cache hit rate dramatically affects average latency: with a 1 ms cache and a 50 ms database, a 95% hit rate yields a 3.45 ms average and a 99% hit rate yields 1.49 ms
- Single server benchmarks: web app 1k to 10k RPS, database 10k to 50k simple queries/sec, Redis 100k+ ops/sec
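The cache hit-rate arithmetic above is just a weighted average of the two latencies. A minimal sketch, assuming the 1 ms cache and 50 ms database figures from the takeaways:

```python
def avg_latency_ms(hit_rate, cache_ms=1.0, db_ms=50.0):
    """Expected read latency when a fraction `hit_rate` of reads is served from cache."""
    return hit_rate * cache_ms + (1.0 - hit_rate) * db_ms

# 95% hits: 0.95 * 1 + 0.05 * 50 = 3.45 ms
print(round(avg_latency_ms(0.95), 2))
# 99% hits: 0.99 * 1 + 0.01 * 50 = 1.49 ms
print(round(avg_latency_ms(0.99), 2))
```

Note that going from 95% to 99% hits cuts average latency by more than half; the miss rate, not the hit rate, dominates the average.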
📌 Interview Tips
1. Quote specific latency numbers in design discussions; saying "Redis adds about 1 ms" or "cross-region is 50 ms" shows you understand real constraints
2. Use throughput numbers for capacity planning; if you need 100k RPS, explain why you need 10+ application servers or load balancing
3. When calculating cache benefit, show the math: a 95% hit rate with a 1 ms cache and a 50 ms database yields a 3.45 ms average
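The capacity-planning step in tip 2 can be sketched as a one-line calculation. The 70% utilization headroom here is an illustrative assumption, not from the original; sizing to full benchmarked capacity leaves no room for traffic spikes:

```python
import math

def servers_needed(target_rps, per_server_rps, headroom=0.7):
    """Servers required if each runs at `headroom` of its benchmarked capacity."""
    return math.ceil(target_rps / (per_server_rps * headroom))

# 100k RPS on a web tier benchmarked at 10k RPS per server,
# run at 70% utilization: 15 servers (10 at full utilization).
print(servers_needed(100_000, 10_000))
```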