
Concurrency, Throughput, and Throttling

High concurrency is the primary lever for maximizing upload throughput, but it introduces risks of server throttling, client memory pressure, and network unfairness. Production systems tune concurrency based on link capacity, server limits, and observed error rates; common defaults cap concurrency at 8 to 32 concurrent part uploads per file. With 128 MB parts and 16-way parallelism on a stable 10 Gbps link, a single object can be uploaded at multi-gigabit throughput. Amazon S3 scales to thousands of requests per second per prefix, but an aggressive client uploading hundreds of parts simultaneously risks HTTP 429 (rate limit) or 503 (service unavailable) responses. When throttled, clients must implement exponential backoff with jitter and dynamically reduce concurrency.

Adaptive concurrency control is critical for production reliability. Start at a moderate level (for example, 8 workers); if success rates remain high and latency stays low, incrementally increase to 16, then 32. On encountering 429 or 503 errors, immediately halve concurrency and apply exponential backoff before retrying. This prevents cascading failures and respects server capacity limits.

Memory pressure is another concern: uploading 32 parts of 256 MB each holds 8 GB in flight. Clients must use bounded buffers and streaming reads from disk to avoid exhausting memory, especially on resource-constrained mobile devices or serverless functions with fixed memory limits.
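The loop below is a minimal, SDK-agnostic sketch of that adaptive policy: it doubles the worker count after a clean batch and halves it, then sleeps with full jitter, after a throttled one. `upload_part` and `ThrottledError` are placeholders for whatever per-part call and throttling signal your client actually uses.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


class ThrottledError(Exception):
    """Stand-in for the client's HTTP 429/503 error."""


def upload_part(part_number, data):
    """Placeholder for the real per-part upload request."""
    time.sleep(0.01)  # simulate network time; a real call may raise ThrottledError


def upload_adaptively(parts, start_workers=8, max_workers=32):
    """Upload (part_number, bytes) pairs, growing concurrency on success
    and halving it with jittered backoff when throttled."""
    workers, backoff, pending = start_workers, 1.0, list(parts)
    while pending:
        batch, pending = pending[:workers], pending[workers:]
        throttled = False
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = {pool.submit(upload_part, n, d): (n, d) for n, d in batch}
            for fut in as_completed(futures):
                try:
                    fut.result()
                except ThrottledError:
                    pending.append(futures[fut])  # re-queue the throttled part
                    throttled = True
        if throttled:
            workers = max(1, workers // 2)                    # shed load immediately
            time.sleep(backoff + random.uniform(0, backoff))  # exponential backoff + jitter
            backoff = min(backoff * 2, 60.0)
        else:
            workers = min(workers * 2, max_workers)           # 8 -> 16 -> 32 on success
            backoff = 1.0
```

Real clients typically grow concurrency more gradually and judge success over a sliding window of requests rather than per batch, but the shape of the control loop is the same.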
💡 Key Takeaways
Production defaults: 8 to 32 concurrent part uploads per file to balance throughput and throttling risk
Throughput example: 16 concurrent 128 MB parts on a 10 Gbps link can achieve multiple gigabits per second for a single-object upload
Adaptive concurrency: Start at 8 workers, increase on success, halve on 429 or 503 errors with exponential backoff and jitter
Memory pressure: 32 parts × 256 MB = 8 GB in flight; use bounded buffers and streaming disk reads to avoid exhausting client memory (see the sketch after this list)
Amazon S3 scales to thousands of requests per second per prefix but aggressive parallelism from single clients triggers rate limiting
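To make the memory-pressure point concrete, here is one way to bound in-flight bytes, sketched with a hypothetical `upload_part` stub: a semaphore stops the reader from pulling more parts off disk until earlier uploads release their slots, so peak memory stays near MAX_IN_FLIGHT × PART_SIZE instead of the whole file.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PART_SIZE = 128 * 1024 * 1024        # 128 MB parts
MAX_IN_FLIGHT = 16                   # caps in-flight memory at ~16 x 128 MB = 2 GB

slots = threading.Semaphore(MAX_IN_FLIGHT)


def stream_parts(path):
    """Read fixed-size parts lazily; block while MAX_IN_FLIGHT parts are outstanding."""
    with open(path, "rb") as f:
        part_number = 0
        while True:
            slots.acquire()          # wait for a completed upload to free memory
            chunk = f.read(PART_SIZE)
            if not chunk:
                slots.release()
                return
            part_number += 1
            yield part_number, chunk


def upload_part(part_number, data):
    """Placeholder for the real per-part upload request."""
    try:
        pass                         # send `data` to the object store here
    finally:
        slots.release()              # let the reader pull the next part from disk


def upload_file(path):
    with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
        for part_number, chunk in stream_parts(path):
            pool.submit(upload_part, part_number, chunk)
```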
📌 Examples
AWS SDK default: 10 concurrent part uploads with automatic retry and exponential backoff for S3 multipart transfers (see the boto3 sketch after this list)
Google Cloud Storage client libraries: Start with 8 parallel streams, monitor 503 responses, back off and reduce concurrency dynamically
Browser upload tool (Uppy) with S3: Configurable concurrency limit (default 5 to 10 parts) to avoid overwhelming client device or hitting rate limits
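As one concrete configuration, the snippet below shows how the AWS SDK example above is typically expressed with boto3's TransferConfig. The bucket, key, and file names are placeholders; max_concurrency=10 mirrors the default cited above, and multipart_chunksize sets the 128 MB part size used in the throughput example.

```python
import boto3
from boto3.s3.transfer import TransferConfig

# Managed multipart transfer: the SDK splits the file into parts,
# uploads them in parallel, and retries transient failures with backoff.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # switch to multipart above 64 MB
    multipart_chunksize=128 * 1024 * 1024,  # 128 MB parts
    max_concurrency=10,                     # concurrent part uploads (SDK default)
    use_threads=True,
)

s3 = boto3.client("s3")
s3.upload_file("large-file.bin", "example-bucket", "uploads/large-file.bin", Config=config)
```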