
Availability Prioritized (AP) vs Consistency Prioritized (CP) Wide-Column Systems

Availability Prioritized (AP) systems like Cassandra remain available under network partitions by allowing all nodes to accept writes independently, using eventual consistency with tunable per-operation guarantees. You configure a replication factor (typically 3) and choose read/write consistency levels per request. Setting both the write level and the read level to QUORUM, so that reads plus writes exceed the replication factor (2 + 2 > 3), guarantees that every read overlaps the latest acknowledged write for a key. LOCAL_QUORUM in multi-datacenter deployments keeps latency low (single-digit-millisecond medians) while tolerating datacenter failures. The tradeoff is no multi-row transactions and the potential for last-write-wins conflicts, which demands tight Network Time Protocol (NTP) discipline with clock skew under 100 ms.

Consistency Prioritized (CP) systems like HBase provide linearizable per-row operations by routing all mutations for a region to a single leader. The leader serializes writes and serves strongly consistent reads, simplifying application logic for inbox ordering or counter semantics. Facebook Messages uses this model at multi-petabyte scale with billions of writes per day, combining cache hits in a few milliseconds with HBase p99 random reads in the tens of milliseconds. The tradeoff is reduced availability under partitions (writes stall if the leader is unreachable) and operational complexity from master coordination via ZooKeeper.

Choose AP when you need always-on operation across datacenter failures and can design around eventual consistency or per-key quorum semantics; Netflix runs hundreds of AP clusters across multiple regions handling trillions of operations per day with sub-20 ms p99 latency. Choose CP when per-row correctness is paramount and you benefit from integration with the Hadoop Distributed File System (HDFS) for batch analytics, accepting master coordination overhead and partition unavailability.
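As a concrete illustration of per-request tunable consistency, here is a minimal sketch using the DataStax Python driver (cassandra-driver). The contact points, keyspace, and table are hypothetical, and a replication factor of 3 is assumed.

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Hypothetical contact points and keyspace; replication factor 3 assumed.
cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
session = cluster.connect("app_keyspace")

# Write at QUORUM: 2 of 3 replicas must acknowledge.
write = SimpleStatement(
    "UPDATE user_profile SET email = %s WHERE user_id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(write, ("a@example.com", "user-42"))

# Read at QUORUM: 2 of 3 replicas are consulted. Because 2 + 2 > 3,
# at least one replica in the read set saw the latest write.
read = SimpleStatement(
    "SELECT email FROM user_profile WHERE user_id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
row = session.execute(read, ("user-42",)).one()
```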
💡 Key Takeaways
AP systems achieve always-on writes by accepting operations at any replica with tunable consistency (LOCAL_QUORUM gives 5 to 10 ms p99), staying available under datacenter failures but risking last-write-wins conflicts
CP systems route all writes for a partition to a single leader, providing linearizable per-row reads and writes, simplifying correctness but losing availability when the leader is unreachable
Tunable consistency in AP requires read level plus write level exceeding the replication factor for immediate consistency (QUORUM read + QUORUM write with replication factor 3 gives 2 + 2 > 3); see the sketch after this list
Clock skew in AP systems with last-write-wins can cause lost updates, requiring tight NTP synchronization (under 100 ms drift) and idempotent write design
CP master coordination via ZooKeeper adds operational complexity, with risks of assignment stalls during coordination outages or network instability
Netflix uses AP for trillions of operations per day across regions with sub-20 ms p99, while Facebook uses CP for billions of inbox writes per day requiring strong per-thread ordering
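The overlap arithmetic in the third takeaway can be checked with a few lines of plain Python; the function name here is ours, not part of any driver API.

```python
def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """Immediate consistency per key requires the read set and the
    write set to intersect in at least one replica: R + W > RF."""
    return read_replicas + write_replicas > replication_factor

assert overlaps(2, 2, 3)        # QUORUM read + QUORUM write, RF = 3
assert not overlaps(1, 1, 3)    # ONE + ONE: a read can miss the latest write
assert not overlaps(2, 1, 3)    # QUORUM read + ONE write still risks stale reads
```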
📌 Examples
AP multi-datacenter write: Client writes to the local datacenter with LOCAL_QUORUM (2 of 3 local replicas acknowledge in 5 ms); async replication to remote datacenters completes in 50 to 200 ms depending on Wide Area Network (WAN) latency. The system stays available if one datacenter fails.
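A hedged sketch of how this scenario might be configured with the DataStax Python driver's execution profiles; the datacenter name, host, keyspace, and table are all hypothetical.

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import DCAwareRoundRobinPolicy

# Route requests to the local datacenter and require a quorum there only.
profile = ExecutionProfile(
    load_balancing_policy=DCAwareRoundRobinPolicy(local_dc="us-east"),  # hypothetical DC name
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
cluster = Cluster(["10.0.0.1"], execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect("app_keyspace")  # hypothetical keyspace

# Acknowledged once 2 of 3 local replicas respond; remote datacenters
# receive the mutation asynchronously (50 to 200 ms over the WAN).
session.execute(
    "UPDATE user_profile SET email = %s WHERE user_id = %s",
    ("a@example.com", "user-42"),
)
```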
CP region server failure: An HBase region containing a user's inbox lives on a failed server. The master detects the failure via a ZooKeeper heartbeat timeout (30 seconds by default), reassigns the region to a healthy server (10 to 60 seconds), and replays the Write Ahead Log (WAL) from HDFS (5 to 30 seconds). Total unavailability: 45 to 120 seconds.
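Clients typically ride out this window by retrying. Here is a minimal sketch using the happybase Thrift client; the gateway host, table name, and retry policy are assumptions, and the broad exception handler stands in for the transport errors a real failover would surface.

```python
import time

import happybase

connection = happybase.Connection("hbase-thrift.example.com")  # hypothetical gateway
inbox = connection.table("user_inbox")                         # hypothetical table

def row_with_retry(row_key: bytes, attempts: int = 6, base_delay: float = 5.0):
    """Retry reads across a region reassignment (45 to 120 s above).
    Each row is served by exactly one region at a time, so any
    successful read reflects all acknowledged writes."""
    for attempt in range(attempts):
        try:
            return inbox.row(row_key)
        except Exception:  # Thrift transport/IO errors while the region moves
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (attempt + 1))  # linear backoff

messages = row_with_retry(b"user42")
```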
AP consistency anomaly: Two datacenters accept concurrent writes for the same key, stamped T1 = 10:00:00.100 and T2 = 10:00:00.050 by clocks that have drifted apart. Last-write-wins keeps T1 because its timestamp is larger, even though the write stamped T2 actually happened after T1 at the application level, so that update is silently lost.
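The anomaly reduces to a comparison of timestamps, which a few lines of plain Python make concrete (the values mirror the scenario above; the Write class is ours, purely illustrative).

```python
from dataclasses import dataclass

@dataclass
class Write:
    value: str
    timestamp_ms: int  # assigned by the coordinating node's local clock

# Datacenter B's clock runs ~150 ms behind, so the logically later
# write receives the *smaller* timestamp (times from the scenario above,
# in milliseconds since midnight).
t1 = Write("v-old", 36_000_100)  # 10:00:00.100
t2 = Write("v-new", 36_000_050)  # 10:00:00.050, but issued after t1

# Last-write-wins resolves conflicts purely by timestamp...
winner = max((t1, t2), key=lambda w: w.timestamp_ms)
assert winner is t1  # ...so "v-new" is silently discarded
```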