Sharding and Shard Key Selection at Scale
Document databases distribute data horizontally by sharding on a chosen key. The shard key determines which shard stores each document, how queries route across the cluster, and whether load distributes evenly or concentrates on hot partitions. A poorly chosen shard key can bottleneck an entire system, while the right choice enables linear scaling to millions of operations per second.
High cardinality is essential: a shard key with few distinct values (such as a boolean status flag or a low-cardinality field like country) limits the number of effective shards and creates large, unbalanced chunks. Monotonic keys like timestamps or sequential IDs concentrate all recent writes on a single shard (the one handling the latest range), creating a write hotspot while other shards sit idle. For example, using orderTimestamp as the shard key means all new orders hit one shard until the balancer splits and migrates chunks, causing p99 latency spikes from 10ms to 500ms during peak traffic.
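A minimal mongosh sketch of this anti-pattern (the ecommerce.orders namespace and its fields are illustrative, not from a real deployment):

```javascript
// Shard a (hypothetical) orders collection on a monotonically increasing
// timestamp -- the anti-pattern described above.
sh.enableSharding("ecommerce")
sh.shardCollection("ecommerce.orders", { orderTimestamp: 1 })

// Every new document has a timestamp greater than all existing ones, so it
// falls into the chunk covering [latest split point, MaxKey) -- a chunk that
// lives on exactly one shard. That shard absorbs 100% of the write load.
use ecommerce
db.orders.insertOne({ orderTimestamp: new Date(), customerId: "c42", total: 99.50 })
```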
The solution is hashing or composite keys. Hashing the timestamp or ID distributes writes evenly but loses range query efficiency (you cannot efficiently query "all orders in the last hour" because they scatter across shards). A composite key like (tenantId, hashedOrderId) keeps a tenant's data together for efficient tenant-scoped queries while spreading each tenant's writes via the hash. MongoDB recommends testing shard key distribution with realistic data: a celebrity user or viral product can still create a hot partition if the key does not account for skew.
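In mongosh, the two remedies look roughly like this (collection and field names are assumptions; compound shard keys with a hashed field require MongoDB 4.4+):

```javascript
// Remedy 1: hashed shard key. Writes spread evenly across shards, but a
// range query like "all orders in the last hour" must scatter-gather.
sh.shardCollection("ecommerce.orders", { orderId: "hashed" })

// Remedy 2: compound key with a hashed component (MongoDB 4.4+). Queries
// filtering on tenantId route only to the shards holding that tenant's
// chunks, while the hashed orderId spreads each tenant's writes.
sh.shardCollection("saas.orders", { tenantId: 1, orderId: "hashed" })
```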
Resharding is complex and disruptive. MongoDB 5.0+ supports live resharding, but it still involves chunk migration and increased resource usage. Choose the shard key carefully upfront, considering both current query patterns and future growth. Uber shards ride data by geographic region hashes to balance load across regions while keeping regional queries efficient.
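For reference, a live-resharding sketch on MongoDB 5.0+ (the target key here is an assumed example, not a recommendation from the original):

```javascript
// Live resharding clones the collection under the new key while replaying
// ongoing writes, so expect elevated I/O, CPU, and disk usage until cutover.
sh.reshardCollection("ecommerce.orders", { tenantId: 1, orderId: "hashed" })
```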
💡 Key Takeaways
• Shard key must have high cardinality: using status with 3 values limits the system to 3 effective shards regardless of cluster size, wasting capacity and creating imbalance
• Monotonic keys (timestamps, sequential IDs) create write hotspots where 100% of inserts hit one shard, causing p99 latency spikes from 10ms to 500ms while other shards sit idle
• Hashed shard keys distribute writes evenly (on a 3-shard cluster, each shard takes ~33% of the load) but lose range query efficiency: querying the last hour requires scatter-gather across all shards
• Composite keys like (tenantId, hashedId) enable targeted queries (all of a tenant's data lives on a few shards) while spreading load via the hash component, which is critical for multi-tenant systems
• Celebrity or viral entity problem: even with a good shard key, a single user with millions of followers can create a hot partition unless the key design adds extra distribution for skewed entities
• Resharding is disruptive: MongoDB live resharding works, but its chunk migration consumes I/O and CPU; plan the shard key for 3-5 years of growth to avoid operational pain
📌 Examples
Bad: orderTimestamp as shard key sends every new order to Shard N (the latest time range): other shards sit at 0 QPS while one shard bottlenecks at 50K QPS
Good: hash(orderId) distributes writes evenly at 15K QPS per shard across 4 shards (60K total QPS), but a query for orders in the last hour scatters to all 4 shards
Uber rides: sharding on (regionHash, rideId) keeps regional queries efficient (a query for San Francisco rides hits 2 shards) while spreading each region's writes via the rideId hash
MongoDB example: sh.shardCollection("ecommerce.orders", { customerId: "hashed" }) spreads customer orders evenly but forces a scatter-gather across the whole cluster for time range queries
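To see the routing difference from that last example yourself, compare explain output for a targeted versus a scattered query (a sketch; the createdAt field is an assumption):

```javascript
// Equality match on the hashed shard key: the hash is computable from the
// predicate, so the plan targets a single shard.
db.orders.find({ customerId: "c42" }).explain("queryPlanner")

// Range predicate on a non-key field: no shard can be excluded, so the
// explain output lists every shard (scatter-gather).
db.orders.find({ createdAt: { $gte: new Date(Date.now() - 3600 * 1000) } })
  .explain("queryPlanner")
```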