
Object Storage at Scale: Durability, Key Distribution, and Performance Patterns

How Object Storage Achieves Durability

Object storage systems are designed for 99.999999999% (eleven nines) annual durability by distributing data across multiple failure domains. An object is split into chunks using erasure coding (a technique that generates redundant parity fragments so that any sufficiently large subset can reconstruct the original). A 10+4 scheme splits data into 10 data chunks plus 4 parity chunks, storing each on a different server. Losing any 4 servers still allows complete reconstruction.
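A rough binomial model makes the durability math concrete. This sketch assumes independent chunk failures and an illustrative per-chunk annual loss probability; the numbers are hypothetical, not a provider's actual failure rates.

```python
from math import comb

# Toy durability model for a 10+4 erasure-coding scheme.
# p is an assumed (hypothetical) annual probability of losing one chunk.
k, m = 10, 4          # 10 data chunks + 4 parity chunks
n = k + m
p = 0.01

# The object is lost only if MORE than m chunks fail in the same year,
# modeled as a binomial with independent failures across failure domains.
p_loss = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(m + 1, n + 1))
print(f"Annual object-loss probability: {p_loss:.2e}")
```

Even with a pessimistic 1% per-chunk loss rate, needing 5 simultaneous failures out of 14 drives the loss probability down by orders of magnitude, which is why providers can quote so many nines.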

Durability differs from availability. Eleven nines durability means losing at most 1 object per 10 billion per year. Availability might be only 99.9%, which allows roughly 8.7 hours of downtime annually during which reads can fail. The data survives outages but may be temporarily inaccessible.
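The availability figure above is simple arithmetic over a year of wall-clock hours:

```python
# Availability arithmetic: how much downtime does 99.9% allow per year?
hours_per_year = 365 * 24            # 8760
availability = 0.999
downtime_hours = (1 - availability) * hours_per_year
print(f"{downtime_hours:.2f} hours of potential downtime per year")
```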

Key Distribution and Scalability

Object storage uses consistent hashing (mapping keys to positions on a ring where each server owns a range) to distribute objects across servers without central coordination. Adding a server shifts only the keys adjacent to its ring position. A well-designed key scheme avoids hot partitions: timestamped keys like 2024/01/15/event123.json route all new writes to the same partition. Prefixing the key with a hash distributes writes: a3f2/2024/01/15/event123.json.
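A minimal sketch of both ideas, assuming made-up server names and omitting virtual nodes for clarity:

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    """Map a string to a position on the hash ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    """Toy consistent-hash ring: each server owns the range up to its position."""
    def __init__(self, servers):
        self.ring = sorted((ring_position(s), s) for s in servers)

    def owner(self, key: str) -> str:
        pos = ring_position(key)
        # First server at or past the key's position, wrapping around the ring.
        idx = bisect.bisect(self.ring, (pos, "")) % len(self.ring)
        return self.ring[idx][1]

def prefixed(key: str) -> str:
    """Hash-prefix a timestamped key so sequential writes spread across partitions."""
    return hashlib.md5(key.encode()).hexdigest()[:4] + "/" + key

ring = HashRing([f"server-{i}" for i in range(4)])
print(ring.owner(prefixed("2024/01/15/event123.json")))
```

Because the hash prefix of each key is effectively random, consecutive timestamped keys land on different servers instead of all hammering one partition.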

Throughput scales horizontally. Because objects are independent, write capacity grows roughly linearly with server count: ten times the servers gives about ten times the write throughput. There is no metadata-server bottleneck like file systems have; a million servers can each handle their own partition with no coordination on the write path.

Performance Patterns and Expectations

First-byte latency runs 50-100ms for object storage versus 1-2ms for block storage. The gap comes from the distributed lookup: finding which servers hold the object's chunks, initiating parallel reads, and waiting for enough chunks to reconstruct the data. Throughput for large objects is excellent, often 100+ MB/s per object via parallel chunk downloads.
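The read path can be sketched as "request everything, reconstruct as soon as enough arrives." Here `fetch_chunk` is a hypothetical stand-in for a network read; a real client would decode the erasure-coded chunks rather than just collect them.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_chunk(chunk_id: int) -> bytes:
    # Hypothetical placeholder for a network read of one chunk.
    return b"chunk-%d" % chunk_id

def read_object(n: int = 14, k: int = 10) -> list[bytes]:
    """Request all n chunks in parallel; stop once any k have arrived."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(fetch_chunk, i) for i in range(n)]
        chunks = []
        for future in as_completed(futures):
            chunks.append(future.result())
            if len(chunks) == k:   # any k of n chunks suffice under 10+4
                break
    return chunks

print(len(read_object()))
```

Taking the first k responses, rather than waiting for all n, means a few slow or failed servers do not stall the read.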

Small-object performance suffers. Reading 1,000 separate 1KB objects takes 1,000 round trips at roughly 100ms each. Batch APIs mitigate this somewhat, but object storage fundamentally optimizes for large objects. Pack small files into archives, or use a database for metadata with object storage for blobs.
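The archive pattern can be sketched with the standard library. The file names and payloads here are made up for illustration:

```python
import io
import tarfile

def pack(records: dict[str, bytes]) -> bytes:
    """Bundle many small records into one gzipped tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, payload in records.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()

# 1,000 tiny JSON records become a single object:
archive = pack({f"event-{i}.json": b'{"id": %d}' % i for i in range(1000)})
print(f"{len(archive)} bytes in one object instead of 1,000 round trips")
```

One PUT uploads the archive and one GET retrieves it, trading per-record random access for three orders of magnitude fewer requests.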

Request Pricing Model

Object storage charges per request in addition to capacity. PUTs cost roughly $5 per million requests and GETs about $0.40 per million. Storing 1 million small files costs about the same as storing one large file, but accessing all million costs a million times more in request charges. This pricing incentivizes batching and larger objects.
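The request-cost arithmetic, using the approximate rates quoted above:

```python
# Approximate request pricing: $5 per million PUTs, $0.40 per million GETs.
PUT_PER_MILLION = 5.00
GET_PER_MILLION = 0.40

def request_cost(puts: int, gets: int) -> float:
    return puts / 1e6 * PUT_PER_MILLION + gets / 1e6 * GET_PER_MILLION

# 1 million small files, each written once and read once:
many_small = request_cost(puts=1_000_000, gets=1_000_000)
# The same data packed into a single large object:
one_large = request_cost(puts=1, gets=1)
print(f"${many_small:.2f} vs ${one_large:.6f}")
```

Capacity charges are identical in both cases; the entire difference is the per-request fee, which is why bundling pays off.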

⚠️ Key Trade-off: Eleven nines durability comes from erasure coding across failure domains, but adds 50-100ms lookup latency. Object storage optimizes for never losing data, not for fast random access.
💡 Key Takeaways
Eleven nines (99.999999999%) durability through erasure coding: 10+4 scheme survives loss of any 4 servers
Durability differs from availability: data survives but may be temporarily inaccessible during outages
Consistent hashing distributes objects without central coordination, enabling horizontal scaling to millions of servers
First byte latency of 50-100ms versus 1-2ms for block storage due to distributed chunk lookup
Per request pricing ($5 per million PUTs, $0.40 per million GETs) incentivizes larger objects and batching
📌 Interview Tips
1. When discussing object storage scale, explain why it scales where file systems cannot: there is no shared metadata server. Each object is independent, so each server handles its partition with zero coordination.
2. Address the small-file antipattern explicitly. Storing millions of 1KB files in object storage works, but accessing them can cost up to a million times more in request charges than reading the same data as one large object. Suggest bundling into archives.
3. If asked about consistency, explain that object storage provides read-after-write consistency for new objects but may have eventual consistency for overwrites. Versioning helps: the old version remains until the new version fully propagates.