Object Storage & Blob Storage › Multipart Uploads & Resumable Transfers

Concurrency, Throughput, and Throttling

Parallel Part Uploads

Parts can upload simultaneously because they are independent. A single TCP connection at 100 Mbps transfers 12.5 MB/s. Four parallel connections quadruple throughput to 50 MB/s, assuming sufficient bandwidth. Beyond retry resilience, this parallelism is the primary advantage over a single PUT.

Optimal concurrency depends on bandwidth and latency. High-latency links (e.g., cross-continent uploads) benefit from more parallelism because each connection spends time waiting on round trips. Low-latency links saturate with fewer connections. Start with 4 concurrent uploads and adjust based on throughput measurements.
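A minimal sketch of parallel part uploads, using a thread pool with configurable concurrency. The `upload_part` function and its "etag-N" return values are placeholders for a real PUT request; the part-splitting logic and the part-number-to-ETag map (needed later to complete the multipart upload) follow the pattern described above.

```python
from concurrent.futures import ThreadPoolExecutor

PART_SIZE = 8 * 1024 * 1024  # 8 MiB parts (illustrative choice)

def split_parts(total_size, part_size=PART_SIZE):
    """Yield (part_number, offset, length) tuples covering the object."""
    part_number = 1
    for offset in range(0, total_size, part_size):
        yield part_number, offset, min(part_size, total_size - offset)
        part_number += 1

def upload_part(part):
    # Placeholder: a real client would PUT this byte range with its part number
    # and return the ETag from the response.
    part_number, offset, length = part
    return part_number, f"etag-{part_number}"  # hypothetical ETag

def upload_parallel(total_size, concurrency=4):
    """Upload all parts with at most `concurrency` in flight at once."""
    parts = list(split_parts(total_size))
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        etags = dict(pool.map(upload_part, parts))
    return etags  # part_number -> ETag, required to complete the upload
```

Raising `concurrency` is the knob discussed above: start at 4 and increase only while measured throughput keeps improving.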

Throughput Optimization

TCP slow start reduces initial throughput. A new connection takes several round trips to reach full speed. For small parts, slow start overhead dominates. Larger parts amortize slow start across more data. Alternatively, reuse connections across parts using HTTP keep-alive to skip slow start entirely.
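To make the slow-start cost concrete, here is a simplified model (no packet loss, window doubling every RTT, a common default initial window of 10 segments) that estimates how many round trips a fresh connection needs before its window reaches a target size:

```python
import math

def slow_start_rtts(target_bytes_per_rtt, init_cwnd_bytes=10 * 1460):
    """Round trips for TCP slow start to grow from the initial congestion
    window (10 x 1460-byte segments, a common default) to the target window.
    Simplified model: the window doubles every RTT and no loss occurs."""
    if target_bytes_per_rtt <= init_cwnd_bytes:
        return 0
    return math.ceil(math.log2(target_bytes_per_rtt / init_cwnd_bytes))
```

For a 1 Gbps link with 100 ms RTT, the bandwidth-delay product is 12.5 MB per RTT, which this model says takes about 10 round trips (a full second at 100 ms RTT) to reach, so a small part may finish before the connection ever gets fast. That is the case for larger parts or keep-alive reuse.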

Server-side throttling limits per-client throughput. Hitting rate limits causes 429 Too Many Requests or 503 Service Unavailable. Implement exponential backoff: wait 1s, 2s, 4s between retries. Add jitter to prevent synchronized retry storms from multiple clients.
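A sketch of exponential backoff with full jitter, under the assumptions that throttling surfaces as a hypothetical `ThrottledError` (raised on a 429 or 503 response) and that the delay cap is 60 seconds:

```python
import random
import time

class ThrottledError(Exception):
    """Hypothetical exception for a 429 or 503 response."""

def backoff_delays(max_retries, base, cap=60.0):
    """Exponential backoff with full jitter: random delay up to
    base * 2**attempt, capped at `cap` seconds."""
    for attempt in range(max_retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

def upload_with_retries(do_upload, max_retries=5, base=1.0):
    for delay in backoff_delays(max_retries, base):
        try:
            return do_upload()
        except ThrottledError:
            time.sleep(delay)  # jittered wait avoids synchronized retry storms
    return do_upload()  # final attempt: let the error propagate to the caller
```

Full jitter (random between zero and the exponential ceiling) spreads retries from many clients evenly, rather than having them all wait exactly 1s, 2s, 4s and collide again.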

Bandwidth Fairness

Multiple uploads from the same client compete for bandwidth. Without coordination, a large background upload can starve interactive uploads. Implement priority queues: interactive uploads get higher concurrency, background uploads throttle during active use.
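The priority queue above can be sketched with a heap: interactive parts always dequeue before background parts, with FIFO ordering within each priority. The class name and two-level priority scheme are illustrative, not from any particular library.

```python
import heapq
import itertools

class UploadQueue:
    """Sketch of a two-priority upload queue: interactive parts (priority 0)
    are always dequeued before background parts (priority 1)."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # FIFO tie-breaker within a priority

    def put(self, part, interactive=False):
        priority = 0 if interactive else 1
        heapq.heappush(self._heap, (priority, next(self._seq), part))

    def get(self):
        return heapq.heappop(self._heap)[2]
```

Workers pull from this queue, so a large background upload queued first still yields to interactive parts; throttling background work during active use then reduces to limiting how many workers serve priority-1 entries.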

Server-side, per-tenant rate limiting prevents one customer from saturating shared infrastructure. Typical limits: 5,000 requests per second, 25 Gbps aggregate throughput per account. Exceeding limits triggers throttling across all of that tenant's operations.
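Per-tenant limits are commonly enforced with a token bucket; here is a minimal single-threaded sketch (the rate and burst numbers are illustrative, and a real service would keep one bucket per tenant):

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: tokens refill continuously at
    `rate` per second, up to `burst`; each request consumes one token."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller would respond with 429 Too Many Requests
```

A 5,000 RPS limit would be `TokenBucket(rate=5000, burst=5000)` keyed by tenant; the burst size controls how much short-term spikiness is tolerated before throttling kicks in.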

Progress Reporting

Users expect progress indicators. Track bytes uploaded across all parts. Report percentage as uploaded_bytes / total_bytes. With parallel uploads, progress can jump in chunks as parts complete. Smooth the display by tracking in-flight bytes too: a part that is 80% uploaded contributes 80% of its size to progress.
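A sketch of a thread-safe tracker that counts both completed and in-flight bytes, so the percentage moves smoothly instead of jumping only when whole parts finish (the class and method names are illustrative):

```python
import threading

class ProgressTracker:
    """Track completed plus in-flight bytes across parallel part uploads."""
    def __init__(self, total_bytes):
        self.total = total_bytes
        self.completed = 0
        self.in_flight = {}  # part_number -> bytes sent so far
        self._lock = threading.Lock()

    def update(self, part_number, bytes_sent):
        """Called as a part's upload progresses."""
        with self._lock:
            self.in_flight[part_number] = bytes_sent

    def finish(self, part_number, part_size):
        """Called when a part completes; its bytes move to the completed total."""
        with self._lock:
            self.in_flight.pop(part_number, None)
            self.completed += part_size

    def percent(self):
        with self._lock:
            done = self.completed + sum(self.in_flight.values())
        return 100.0 * done / self.total
```

Each upload worker calls `update` periodically and `finish` on completion; the UI polls `percent()` and sees partial credit for parts still in flight.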

⚠️ Key Trade-off: More parallelism increases throughput but also increases server load and risk of throttling. Start with 4 concurrent uploads, measure actual throughput, increase only if not hitting rate limits.
💡 Key Takeaways
- Parallel part uploads multiply throughput: 4 connections at 100 Mbps yield 50 MB/s aggregate
- TCP slow start overhead favors larger parts or connection reuse via HTTP keep-alive
- Server throttling returns 429 or 503; implement exponential backoff with jitter for retries
- Per-tenant rate limits (5,000 RPS, 25 Gbps typical) throttle all operations when exceeded
- Progress reporting tracks uploaded bytes plus in-flight bytes for smooth percentage display
📌 Interview Tips
1. Explain why parallelism helps even on fast networks. A 1 Gbps link with 100ms latency only achieves 100 Mbps per connection due to bandwidth delay product. Ten parallel connections saturate the link.
2. Describe backoff strategy precisely. On 429 response, wait random(0.5, 1.5) seconds. On next 429, wait random(1, 3) seconds. Cap at 60 seconds. Reset on success.
3. For progress reporting, explain the UX problem. With 10 parallel 100MB parts, progress jumps 10% at a time as parts complete. Track bytes in flight to show smoother progress.