Production Scale Patterns and Traffic Offloading Economics
Traffic Offloading Economics
Proxying data through application servers consumes bandwidth, memory, and CPU. A server handling 100 concurrent 10 MB downloads needs up to 1 GB of memory if it buffers each object fully, plus significant network bandwidth and CPU for encryption. With presigned URLs, the server's work per download shrinks to generating a URL of roughly 200 bytes.
Cost comparison: cloud egress costs $0.05-0.12 per GB. Compute for proxying adds another $0.01-0.03 per GB. Storage egress via presigned URL costs the same $0.05-0.12 per GB but needs no proxy compute. At one petabyte (1,000,000 GB) per month, that removes roughly $10,000-30,000 of monthly compute cost.
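The arithmetic behind that comparison can be sketched as follows. This is a minimal illustration using only the per-GB price ranges quoted above; the function name `monthly_cost` and the decimal petabyte (1,000,000 GB) are assumptions made for the example.

```python
PETABYTE_GB = 1_000_000  # one decimal petabyte expressed in GB (assumption)

def monthly_cost(gb_per_month, egress_per_gb, compute_per_gb=0.0):
    """Monthly transfer cost in dollars for a given per-GB price."""
    return gb_per_month * (egress_per_gb + compute_per_gb)

# Low and high ends of the price ranges: (egress $/GB, proxy compute $/GB)
for egress, compute in [(0.05, 0.01), (0.12, 0.03)]:
    proxied = monthly_cost(PETABYTE_GB, egress, compute)
    direct = monthly_cost(PETABYTE_GB, egress)  # presigned URL: egress only
    print(f"egress ${egress}/GB: proxied ${proxied:,.0f}, "
          f"direct ${direct:,.0f}, saved ${proxied - direct:,.0f}")
```

The egress charge is identical in both cases; only the proxy compute term disappears.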
CDN Integration
Presigned URLs work with CDNs (Content Delivery Networks, geographically distributed cache servers). The flow: the application generates a presigned URL, the user requests it through the CDN, and the CDN fetches the object from origin using the signed URL and caches the response. Subsequent requests for the same URL hit the CDN cache without touching origin.
The challenge: a CDN keys its cache on the URL, so content is served from cache only when requests use the same URL. If each user gets a unique presigned URL, nothing caches. Solution: generate URLs with a longer expiration and only object-specific parameters (nothing user-specific), so that every user receives the same URL for a given object. Alternatively, use CDN signed URLs and let the CDN handle authorization.
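One way to make URLs shareable is to round the expiration to a fixed time bucket, so every request in the same window signs identical parameters. The sketch below uses a simplified HMAC scheme, not any provider's real signing algorithm; `SECRET`, `BUCKET_SECONDS`, `cacheable_url`, and the `cdn.example.com` host are all hypothetical.

```python
import hashlib
import hmac

SECRET = b"demo-secret"   # hypothetical signing key
BUCKET_SECONDS = 3600     # requests within the same hour share one URL

def cacheable_url(object_key: str, now: float, ttl: int = 3600) -> str:
    """Build a signed URL whose parameters depend only on the object and the time bucket."""
    # Round up to the end of the current bucket so the expiry is stable within it
    bucket_end = (int(now) // BUCKET_SECONDS + 1) * BUCKET_SECONDS
    expires = bucket_end + ttl
    # Sign only object-specific data: no user ID, no per-request nonce
    message = f"{object_key}:{expires}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"https://cdn.example.com/{object_key}?expires={expires}&sig={signature}"
```

Two users requesting the same object in the same hour get byte-identical URLs, so the second request can be a cache hit. The cost is that a URL stays valid for between `ttl` and `ttl + BUCKET_SECONDS` seconds rather than an exact duration.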
Batch URL Generation
Generating presigned URLs is CPU bound; the cost is the signature computation, and no network call is needed. A server can generate 10,000-50,000 URLs per second per core. For pages displaying 100 thumbnails, generate all URLs in one batch. Avoid N+1 patterns where each image triggers a separate URL generation request.
Pre-generate URLs for predictable access patterns. For a photo gallery with 1,000 images, generate all URLs when the user opens the gallery, not when each image scrolls into view. URLs signed up front expire sooner relative to when they are actually used; accept that trade for better perceived performance.
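A batch generator might look like the sketch below. It again uses a simplified HMAC signature rather than a real storage provider's scheme, and `SECRET`, `sign_batch`, and the `storage.example.com` host are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import time

SECRET = b"demo-secret"  # hypothetical signing key

def sign_batch(object_keys, ttl=900, now=None):
    """Sign every key in one pass, sharing a single expiry timestamp."""
    expires = int(time.time() if now is None else now) + ttl
    urls = {}
    for key in object_keys:
        # Pure CPU work: one HMAC per object, no network round trip
        sig = hmac.new(SECRET, f"{key}:{expires}".encode(), hashlib.sha256).hexdigest()
        urls[key] = f"https://storage.example.com/{key}?expires={expires}&sig={sig}"
    return urls

# One call when the gallery opens, instead of one call per image
gallery_urls = sign_batch([f"thumbs/{i}.jpg" for i in range(100)])
```

Because all URLs share one expiry, the page needs a single refresh point if the user keeps the gallery open past `ttl`.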
Multi Region Considerations
Presigned URLs contain the storage endpoint, so a URL signed for US East storage does not work against EU West storage. For multi-region architectures, generate URLs pointing to the nearest replica. This requires knowing both the user's location and which regions hold their data.
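Choosing the endpoint can be as simple as a proximity lookup, sketched below. The region names, endpoint hosts, and the `NEAREST` ordering are hypothetical; a real deployment would derive them from its own topology.

```python
# Hypothetical endpoints, one per storage region
ENDPOINTS = {
    "us-east": "https://storage.us-east.example.com",
    "eu-west": "https://storage.eu-west.example.com",
}

# For each user region, storage regions ordered nearest first (assumed ordering)
NEAREST = {
    "us-east": ["us-east", "eu-west"],
    "eu-west": ["eu-west", "us-east"],
}

def endpoint_for(user_region, replica_regions):
    """Return the endpoint of the nearest region that holds the user's data."""
    for region in NEAREST[user_region]:
        if region in replica_regions:
            return ENDPOINTS[region]
    raise LookupError(f"no replica reachable for user region {user_region!r}")
```

The returned endpoint is what the URL must be signed against, since a signature for one region's endpoint is not valid at another.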
Cross-region replication adds complexity. If data replicates with a delay, a URL for a just-written object might point at a replica that has not yet received it, and the request fails. Either sign URLs against a region that offers read-after-write consistency for the object, or delay URL generation until replication is confirmed.
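One hedged way to apply the first option is to fall back to the region that accepted the write until the preferred replica is confirmed. In this sketch, `region_for_read` and the `confirmed_replicas` set are hypothetical; how replication confirmation is tracked depends on the storage system.

```python
def region_for_read(preferred, primary, confirmed_replicas):
    """Pick the region to sign a URL against for a recently written object.

    preferred: region nearest the user
    primary: region that accepted the write (assumed read-after-write consistent)
    confirmed_replicas: regions where replication of this object is confirmed
    """
    if preferred == primary or preferred in confirmed_replicas:
        return preferred
    # Replication not yet confirmed: serve from the write region instead
    return primary
```

This trades some latency for correctness during the replication window; once the replica is confirmed, new URLs point at the nearer region again.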