Data Partitioning and Replication in Key-Value Stores
Why Partition Data
A single server has finite memory, CPU, and network capacity. To scale beyond one machine, distribute data across multiple nodes. Each node handles a subset of keys, allowing the cluster to grow by adding nodes. The challenge is deciding which keys go where and keeping data available when nodes fail.
Hash Partitioning
The most common approach: hash the key and assign it to a partition based on the hash value. Simple modulo sharding (hash(key) % num_nodes) works, but it causes massive data movement whenever the cluster size changes: adding one node to a 100-node cluster remaps roughly 99% of keys.
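The remapping cost of modulo sharding can be measured directly. The sketch below is illustrative (the `shard` helper and `user:N` key names are assumptions, not a specific system's API); it hashes 10,000 keys into a 100-node cluster, then into a 101-node cluster, and counts how many keys land on a different node.

```python
import hashlib

def shard(key: str, num_nodes: int) -> int:
    # Hash the key to a stable integer, then bucket it with modulo.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

keys = [f"user:{i}" for i in range(10_000)]
before = {k: shard(k, 100) for k in keys}
after = {k: shard(k, 101) for k in keys}  # cluster grows by one node
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys remapped")
```

A key keeps its partition only when its hash happens to agree modulo both 100 and 101, so in practice almost every key moves.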
Consistent Hashing
Consistent hashing minimizes data movement during cluster changes. Conceptually, keys and nodes map to positions on a circular ring, and each key routes to the first node clockwise from its position. Adding a node moves only the keys that fall between the new node and its predecessor on the ring, on average 1/N of the total rather than nearly all of them. Virtual nodes (multiple ring positions per physical node, typically 100-256) spread load more evenly.
Replication for Availability
Each partition is replicated to multiple nodes for fault tolerance. A common configuration keeps 3 copies: one primary handles writes and reads, while two replicas stand by for failover. When the primary fails, a replica is promoted automatically.
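On a consistent-hash ring, a common way to pick the 3 replica holders is a preference list: the primary is the first node clockwise of the key, and the replicas are the next distinct nodes around the ring. The helper below is a simplified sketch (one position per node, no virtual nodes; `preference_list` is an assumed name).

```python
import hashlib

def _pos(value: str) -> int:
    # Stable ring position for a node name or a key.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def preference_list(key, nodes, n_replicas=3):
    # Walk clockwise from the key's position, collecting distinct nodes.
    ring = sorted(nodes, key=_pos)
    key_pos = _pos(key)
    start = next((i for i, node in enumerate(ring) if _pos(node) >= key_pos),
                 0)  # wrap to the ring's start if the key hashes past the end
    return [ring[(start + j) % len(ring)] for j in range(n_replicas)]

nodes = [f"node-{i}" for i in range(5)]
print(preference_list("user:42", nodes))  # primary first, then two replicas
```

The first entry acts as the primary; on failure, the next node in the list is a natural promotion candidate because it already holds a copy of the partition.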
Replication Modes
Synchronous replication waits for all replicas to acknowledge before confirming a write. This guarantees durability (the data survives any single node failure) but adds latency, typically 10-50 ms per write. Asynchronous replication acknowledges as soon as the primary has written locally, risking data loss if the primary fails before the write reaches a replica. Most systems let you choose per operation based on durability requirements.
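The per-operation trade-off can be sketched with a toy write path. This is a simulation under loud assumptions: `replicate` stands in for a replication RPC, and a 10 ms sleep stands in for network latency; nothing here reflects a specific system's API.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

POOL = ThreadPoolExecutor(max_workers=4)
store = {}  # the primary's local key-value store

def replicate(replica_id, key, value):
    # Stand-in for a replication RPC; the sleep simulates network latency.
    time.sleep(0.01)
    return replica_id

def write(key, value, replicas=(1, 2), sync=True):
    store[key] = value  # the primary applies the write first
    futures = [POOL.submit(replicate, r, key, value) for r in replicas]
    if sync:
        wait(futures)  # synchronous: block until every replica acknowledges
    # asynchronous: return immediately; the write is lost if the primary
    # dies before the in-flight replications complete
    return "ok"

t0 = time.perf_counter()
write("a", 1, sync=True)
sync_ms = (time.perf_counter() - t0) * 1000
t0 = time.perf_counter()
write("b", 2, sync=False)
async_ms = (time.perf_counter() - t0) * 1000
print(f"sync: {sync_ms:.1f} ms, async: {async_ms:.1f} ms")
```

The synchronous write pays at least one full replication round-trip before acknowledging; the asynchronous write returns as soon as the primary's local write and the RPC submissions complete.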