CAP at Production Scale
CAP Choices Are Per Subsystem
In a real production system handling 100M monthly active users with peak traffic of 200k requests per second, you rarely make one global CAP choice. Instead, you split responsibilities across subsystems based on their correctness requirements and latency targets.
Consider an e-commerce platform like Amazon or Shopify:
Ordering and Payments (CP):
Strong correctness is non-negotiable: double-charging users or overselling inventory costs money and trust. These services use CP semantics with leader-based replication and majority writes. During a partition, if the primary region cannot confirm a write with enough replicas, the system rejects new orders rather than risk an inconsistent state. Typical configuration: 5 replicas across 3 availability zones, 3 acknowledgments required, p99 write latency of 15ms to 30ms within the region.
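A minimal sketch of what a majority-acknowledged write can look like, using MongoDB's per-collection write concern as one concrete option (the hostnames, collection, and field names here are hypothetical):

```python
from pymongo import MongoClient, WriteConcern, errors

client = MongoClient("mongodb://orders-primary.internal:27017")

# Require acknowledgment from a majority of replicas before the write
# is considered committed; give up after 5 seconds.
orders = client.shop.get_collection(
    "orders",
    write_concern=WriteConcern(w="majority", wtimeout=5000),
)

try:
    orders.insert_one({"order_id": "ord-42", "sku": "sku-123", "qty": 1})
except errors.PyMongoError:
    # During a partition the primary may be unable to reach a majority.
    # A CP service surfaces the failure instead of silently accepting the order.
    raise
```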
Product Catalog and Recommendations (AP):
Slight staleness is acceptable: showing a product that went out of stock 2 seconds ago is fine. These services use AP semantics with eventual consistency. Data replicates across 3 availability zones with roughly 1ms RTT under normal conditions, and the target p99 read latency is 10ms to 20ms. During a partition that isolates one zone, all zones continue serving requests; clients might see slightly different product availability for seconds to minutes until replication catches up.
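On the AP side, a relaxed per-query consistency level is enough. A sketch using the Python Cassandra driver (the contact points, keyspace, and table are made up for illustration):

```python
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.1.10", "10.0.2.10", "10.0.3.10"])
session = cluster.connect("catalog")

# LOCAL_ONE: answer from any single replica in the local zone/datacenter.
# Fast and partition-tolerant, but the stock count may be seconds stale.
query = SimpleStatement(
    "SELECT product_id, name, in_stock FROM products WHERE product_id = %s",
    consistency_level=ConsistencyLevel.LOCAL_ONE,
)
row = session.execute(query, ["sku-123"]).one()
```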
Session Storage (AP with Tuning):
User sessions are stored in something like DynamoDB or Redis. The default mode is eventually consistent reads (AP, 5ms p99 latency). For sensitive operations like checkout, the app can request strongly consistent reads (CP-style, 15ms p99 latency, and the read may fail during a partition).
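With DynamoDB this is literally a per-request flag. A sketch with boto3 (the table name and key are assumptions):

```python
import boto3

sessions = boto3.resource("dynamodb").Table("user-sessions")

# Default: eventually consistent read -- cheaper and lower latency (AP-style).
cart_view = sessions.get_item(Key={"session_id": "sess-abc123"})

# Checkout path: strongly consistent read -- returns the latest committed
# value, costs more latency, and can fail during a partition (CP-style).
checkout = sessions.get_item(
    Key={"session_id": "sess-abc123"},
    ConsistentRead=True,
)
```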
✓ In Practice: Modern systems like DynamoDB, MongoDB, and Cassandra let you choose consistency per operation. You tune based on the importance of the data, not with a single system-wide setting.
A real-world example: Netflix operates with 230M subscribers generating massive viewing data. Their viewing history and recommendations use AP stores (Cassandra) with eventual consistency, accepting seconds of replication lag in exchange for 99.99% availability. But their payment processing and subscription management use CP databases to ensure billing accuracy.
💡 Key Takeaways
✓ Production systems make CP versus AP choices per subsystem, not globally. Financial data uses CP; user content feeds use AP.
✓ At 200k requests per second peak, even 10ms of added latency from CP consensus costs capacity; you pay for consistency with throughput (see the back-of-envelope calculation after this list).
✓ Tunable consistency lets you choose per operation: eventually consistent reads at 5ms p99, or strongly consistent reads at 15ms p99 with lower availability.
✓ Systems like Netflix use AP stores for viewing history (tolerating seconds of lag) but CP for billing (which cannot miss or duplicate charges).
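To make the second takeaway concrete, a back-of-envelope calculation using Little's Law (the per-server concurrency figure is an assumption, not from the text):

```python
# Little's Law: in-flight requests L = arrival rate λ × time in system W.
peak_rps = 200_000          # λ: peak requests per second (from the text)
added_latency_s = 0.010     # W: 10 ms of extra consensus latency

extra_in_flight = peak_rps * added_latency_s
print(extra_in_flight)      # 2000 additional requests held open at any moment

# If each app server comfortably handles ~100 concurrent requests
# (an assumed figure), that is roughly 20 extra servers of capacity
# spent purely on waiting for consensus.
print(extra_in_flight / 100)  # ~20
```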
📌 Examples
1. E-commerce inventory: CP database with 5 replicas and majority writes. During a partition that isolates 2 of the 5 replicas, the isolated side cannot form a quorum and rejects orders to prevent overselling (see the quorum sketch after these examples). Availability drops from 99.99% to 99.9% but correctness is maintained.
2. Social media feed: AP cache replicated across 3 availability zones. During a partition, each zone serves its local data. A user might miss 5 to 10 recent posts for 30 to 60 seconds until the partition heals and anti-entropy repair runs.
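A tiny sketch of the quorum math behind example 1 (replica counts from the example; the helper function is just for illustration):

```python
N_REPLICAS = 5      # total replicas (from example 1)
WRITE_QUORUM = 3    # acknowledgments required for a committed write

def can_accept_writes(reachable_replicas: int) -> bool:
    """A partition side can keep taking orders only if it can still form a write quorum."""
    return reachable_replicas >= WRITE_QUORUM

print(can_accept_writes(3))  # True: the majority side keeps accepting orders
print(can_accept_writes(2))  # False: the isolated minority rejects orders to avoid overselling
```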