Design Fundamentals • CAP TheoremMedium⏱️ ~2 min
Choosing CP: Consistency and Partition Tolerance
Systems that choose CP prioritize correctness over availability during network partitions. When a majority of replicas becomes unreachable, the system refuses to serve some or all requests rather than risk violating linearizability. Google Spanner exemplifies this approach: it uses Paxos based consensus with majority quorums, ensuring that every successful write is visible to subsequent reads in a globally consistent order. If an availability zone fails and majority is lost, writes become unavailable until connectivity restores.
The cost of CP is measurable in both latency and availability. Write operations require at least one round trip to a majority quorum, typically 1 to 5 milliseconds within a region but 50 to 150 milliseconds across continents. Google's F1 system (which powers AdWords) uses Spanner precisely because account balances and bid amounts must remain globally consistent; a stale read could let an advertiser spend more than their budget. The trade off is that cross region writes have p50 latencies around 100ms, acceptable for account updates but too slow for user facing page loads.
ZooKeeper represents another CP choice for coordination primitives. With a typical 5 node ensemble, losing 3 nodes means the service becomes completely unavailable rather than risk split brain scenarios where two leaders might issue conflicting lock grants. This unavailability during majority loss is intentional: locks, leader election, and configuration management require perfect consistency. ZooKeeper achieves tens of thousands of operations per second with 1 to 10ms latencies, but it's designed for low volume coordination, not high throughput data storage.
💡 Key Takeaways
•CP systems reject writes (and sometimes reads) when they cannot reach a majority quorum, sacrificing availability to guarantee linearizability and prevent divergent state
•Write latency includes at least one quorum round trip: 1 to 5ms intra region, 50 to 150ms cross region; Google Spanner adds commit wait (bounded by TrueTime epsilon, typically under 7ms) for external consistency
•Use CP for operations where invariants must hold: financial balances, inventory with strict capacity limits, unique constraints like primary keys, and coordination primitives like locks
•ZooKeeper with 5 nodes becomes unavailable if 3 nodes fail; this is correct behavior for leader election and distributed locks where split brain would cause data corruption
•Google F1 (AdWords) tolerates 100ms p50 write latency for account balance updates because correctness (never overspend budget) outweighs speed; user facing reads use caching
📌 Examples
Google Spanner refuses writes during majority partition; a 3 replica deployment across availability zones becomes unavailable if 2 zones are unreachable, preventing balance inconsistencies
Bank account transfers require CP: if you cannot confirm the debit and credit with a majority, reject the transaction rather than risk double spending or lost money
Primary key uniqueness in distributed databases needs CP; allowing inserts during partition could create duplicate keys that violate constraints when partitions heal
Ticketing systems with strict capacity (100 seats) must use CP to prevent overselling; AP would risk selling seat 100 twice during a partition