Failure Modes: Orphaned Uploads, Part Limits, and Integrity Drift
Orphaned Uploads
An upload that is initiated but never completed leaves its parts on the server, where they keep consuming storage. Common causes: the client crashes and never resumes, the user cancels but the client never calls abort, or the network connection is lost for good. At scale, orphaned uploads add up to a significant storage cost.
Mitigation requires cooperation between client and server. Clients should call abort upload whenever an upload is canceled. Servers should apply lifecycle policies that delete incomplete uploads after 7 to 30 days. Monitor the count and total storage size of orphaned uploads; if either grows unexpectedly, investigate client-side abort handling.
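The server-side sweep and the monitoring metric can be sketched as follows. This is a minimal illustration, not any particular storage service's API: the PendingUpload record, its field names, and the 7-day default are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PendingUpload:
    """Hypothetical record for an upload that was initiated but not completed."""
    upload_id: str
    initiated: datetime   # timezone-aware time the upload was initiated
    stored_bytes: int     # total size of the parts uploaded so far


def find_stale_uploads(uploads, now, max_age_days=7):
    """Return the pending uploads that were initiated before the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [u for u in uploads if u.initiated < cutoff]


def orphan_metrics(uploads, now, max_age_days=7):
    """Summarize stale uploads as a count and a byte total for monitoring."""
    stale = find_stale_uploads(uploads, now, max_age_days)
    return {"count": len(stale), "bytes": sum(u.stored_bytes for u in stale)}
```

A cleanup job would call abort for each upload returned by find_stale_uploads and export orphan_metrics to a dashboard, so that unexpected growth is visible before the lifecycle policy removes the evidence.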
Part Limit Exhaustion
Once an upload reaches 10,000 parts, no more parts can be added. With 5 MB parts, the minimum part size, this caps a file at 50 GB. Larger files require larger parts, but the part size is fixed at upload initiation. If you underestimate the file size and choose parts that are too small, the upload fails when it hits the limit.
Prevention: calculate the minimum part size before initiating by dividing the expected file size by 10,000 and rounding up, never going below the 5 MB minimum. If the file size might grow (streaming upload), use conservatively large parts. If the exact size is unknown, estimate high. Hitting the limit mid-upload requires aborting and restarting with larger parts.
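The part-size calculation can be sketched like this. The constants mirror the limits quoted above; treating 5 MB as 5 * 1024 * 1024 bytes and the headroom parameter are assumptions of this example, not part of any specific service contract.

```python
import math

MAX_PARTS = 10_000
MIN_PART_SIZE = 5 * 1024 * 1024  # assumption: "5 MB" counted as 5 MiB


def choose_part_size(expected_size, headroom=1.0):
    """Pick the smallest part size (in bytes) that fits within MAX_PARTS.

    headroom > 1.0 pads the estimate for uploads whose final size is
    uncertain, for example streaming uploads.
    """
    if expected_size < 0 or headroom < 1.0:
        raise ValueError("expected_size must be >= 0 and headroom >= 1.0")
    padded = math.ceil(expected_size * headroom)
    needed = -(-padded // MAX_PARTS)  # integer ceiling division
    return max(MIN_PART_SIZE, needed)
```

For a 100 GiB file this returns a part size slightly above 10 MiB, which keeps the upload at or under 10,000 parts; for small files it falls back to the minimum part size.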
Integrity Drift
Integrity drift occurs when the source file changes while parts are uploading. Part 1 comes from file version A, part 50 comes from file version B. The assembled object is corrupted: a mix of old and new data. Neither checksum validation during upload nor server-side integrity checks catch this, because each part is individually valid.
Prevention requires file locking or change detection. Either lock the file during the upload to prevent modification, or compute a hash of the file before starting and verify that it is unchanged before completing. If the file has changed, abort and notify the user. For generated data, include sequence numbers or hashes that make version skew detectable during assembly.
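The hash-before, verify-before-completing approach can be sketched as below. The upload_parts, complete, and abort callables are hypothetical stand-ins for whatever client library is in use, and SHA-256 is an arbitrary choice of hash for the example.

```python
import hashlib


class SourceChangedError(RuntimeError):
    """Raised when the source file changed while its parts were uploading."""


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files are never read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_with_change_detection(path, upload_parts, complete, abort):
    """Upload all parts, then complete only if the file hash is unchanged.

    upload_parts(path), complete(), and abort() are hypothetical callables
    supplied by the caller.
    """
    before = file_sha256(path)
    upload_parts(path)
    if file_sha256(path) != before:
        abort()
        raise SourceChangedError(f"{path} changed during upload")
    complete()
```

This only detects a change that is still present when the second hash is taken. A file that is modified and then restored mid-upload would pass the check, which is a case that file locking covers and hashing does not.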
ETag Mismatch at Completion
The complete request lists each part number with its ETag. If any ETag does not match what the server holds, completion fails. Causes: the client recorded the wrong ETag, a part was uploaded twice with different content (the file changed), or a server error occurred. To reconcile, the client must query the server's list of parts.
This failure is a safety feature: it prevents a corrupted object from being assembled. Debug it by comparing the ETags the client recorded with the server's list parts response.
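The comparison can be sketched as a small reconciliation helper. It assumes both sides have already been reduced to a mapping from part number to ETag, and that ETags may arrive wrapped in double quotes on one side but not the other; both are assumptions of this example rather than guarantees of any API.

```python
def _normalize(etag):
    """Strip whitespace and surrounding double quotes (assumed quoting difference)."""
    return etag.strip().strip('"')


def reconcile_parts(client_parts, server_parts):
    """Compare client-recorded ETags with the server's list parts response.

    Both arguments map part number -> ETag string.
    """
    mismatched = sorted(
        n for n in client_parts
        if n in server_parts
        and _normalize(client_parts[n]) != _normalize(server_parts[n])
    )
    missing_on_server = sorted(n for n in client_parts if n not in server_parts)
    unknown_to_client = sorted(n for n in server_parts if n not in client_parts)
    return {
        "mismatched": mismatched,
        "missing_on_server": missing_on_server,
        "unknown_to_client": unknown_to_client,
    }
```

Parts reported as mismatched or missing are candidates for re-upload before retrying completion; parts the client does not know about usually point to a lost response or a duplicate upload worth investigating.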