Replication & Consistency • Leader-Follower Replication · Hard · ⏱️ ~3 min
Geographic Distribution Trade-offs and Cross-Region Replication Latency
Geographic distribution of leader-follower replicas serves two primary goals: disaster recovery and read-latency reduction for global users. These goals, however, create a fundamental tension with write latency. A follower placed across a continent or an ocean provides resilience against regional failures and can serve reads with 50 to 150 milliseconds lower latency to nearby users, but if that follower is synchronous, every write must wait for an acknowledgment across that Wide Area Network (WAN) Round Trip Time (RTT), inflating commit latency from single-digit milliseconds to 50, 100, or even 150+ milliseconds depending on distance. For example, RTT between us-east (Virginia) and eu-west (Ireland) is approximately 80 milliseconds, and us-west (Oregon) to ap-southeast (Sydney) is approximately 140 milliseconds. If your Service Level Objective (SLO) requires p99 write latency under 20 milliseconds, synchronous cross-region replication will violate it.
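To make the trade-off concrete, here is a minimal back-of-the-envelope check in Python. The RTT figures are the ones quoted above; the 2 ms local fsync and the 20 ms p99 write SLO are illustrative assumptions, not measurements.

```python
# Estimate commit latency with one synchronous follower at various distances.
# Assumption: commit waits for the leader's local fsync plus one round trip
# to the follower (the follower's own fsync time is ignored for simplicity).

LOCAL_FSYNC_MS = 2        # assumed durable log write on the leader
WRITE_SLO_P99_MS = 20     # example SLO from the text

RTT_MS = {
    "same AZ":                 1,
    "same region (cross-AZ)":  2,
    "us-east -> eu-west":      80,
    "us-west -> ap-southeast": 140,
}

def sync_commit_latency_ms(rtt_ms: float) -> float:
    return LOCAL_FSYNC_MS + rtt_ms

for placement, rtt in RTT_MS.items():
    latency = sync_commit_latency_ms(rtt)
    verdict = "OK" if latency <= WRITE_SLO_P99_MS else "violates the 20 ms SLO"
    print(f"{placement:<25} ~{latency:>4.0f} ms  {verdict}")
```

Only the in-region placements stay under the 20 ms budget; both cross-region placements blow past it on RTT alone.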
The standard production pattern is to keep synchronous replicas within a single region or metro area (where RTT is typically 1 to 5 milliseconds) to achieve a Recovery Point Objective (RPO) near zero within that region, and to place additional asynchronous replicas in remote regions for disaster recovery and read locality. This accepts an RPO greater than zero for regional failures, equal to the replication lag at failure time, which is usually targeted at sub-second to a few seconds for disaster-recovery scenarios. Oracle Data Guard explicitly recommends the synchronous Maximum Protection or Maximum Availability modes only for metro distances and asynchronous mode for cross-region deployments to avoid unacceptable commit latency. Microsoft SQL Server Always On follows the same pattern: a synchronous secondary within the primary datacenter or metro for automatic failover, and an asynchronous secondary in a disaster-recovery region.
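A small sketch of this hybrid topology, with illustrative placeholder numbers, shows how the commit latency and the RPO come from different replicas:

```python
# Hybrid topology: synchronous standby in the leader's region, asynchronous
# standby in a remote region. All numbers are illustrative placeholders.

LOCAL_FSYNC_MS = 2
replicas = [
    {"name": "standby-local",  "mode": "sync",  "rtt_ms": 3},   # same region
    {"name": "standby-remote", "mode": "async", "rtt_ms": 80},  # remote region
]
async_lag_at_failure_s = 1.0   # observed replication lag to the async standby

# Commit latency: local fsync plus the slowest synchronous acknowledgment.
commit_latency_ms = LOCAL_FSYNC_MS + max(
    r["rtt_ms"] for r in replicas if r["mode"] == "sync"
)

# RPO depends on which failure domain is lost.
rpo_s = {
    "leader AZ fails (promote sync standby)":     0.0,
    "whole region fails (promote async standby)": async_lag_at_failure_s,
}

print(f"commit latency ~= {commit_latency_ms} ms")
for scenario, seconds in rpo_s.items():
    print(f"RPO if {scenario}: {seconds} s")
```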
Google Spanner offers an alternative for applications that require zero data loss even across regions: multi-region configurations with synchronous majority commit spanning regions. Spanner places replicas in multiple regions (for example, 5 replicas across us-central1, us-east1, us-west1) and commits writes when a majority (3 of 5) have durably logged them. This guarantees no data loss even if an entire region fails, but commit latency reflects the distance to the majority: typically 50 to 100+ milliseconds for multi-region instances. Spanner can provide this with external consistency (stronger than linearizability across the system) via TrueTime, making it suitable for globally distributed transactional workloads that can tolerate the write-latency cost. For applications that cannot, alternative architectures include multi-leader replication with conflict resolution (accepting eventual consistency) or application-level sharding by region, with cross-region coordination only when necessary.
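The latency cost of a cross-region majority commit can be sketched as a quorum wait: the write commits once the majority-th fastest replica has acknowledged. The placement and RTTs below are illustrative, not Spanner's actual configuration, and observed commit latency is higher because it also includes the client-to-leader round trip and other protocol overheads.

```python
# Quorum wait for a 5-replica, 3-region placement (illustrative RTTs).

rtt_from_leader_ms = {
    "us-central1 (leader)":  0,
    "us-central1 (replica)": 1,
    "us-east1 (replica A)":  32,
    "us-east1 (replica B)":  33,
    "us-west1 (replica)":    35,
}

def quorum_wait_ms(rtts: dict) -> int:
    """Time until a majority (3 of 5, leader included) have durably logged
    the write: the majority-th smallest round trip."""
    majority = len(rtts) // 2 + 1
    return sorted(rtts.values())[majority - 1]

print(f"replication wait ~= {quorum_wait_ms(rtt_from_leader_ms)} ms")   # ~32 ms here
```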
💡 Key Takeaways
• Cross-region RTTs typically range from roughly 50 ms (nearby regions such as us-east to us-west) to 150+ ms (intercontinental, such as the US to Australia). Synchronous replication to a remote follower adds this RTT to every commit, violating most low-latency SLOs.
• Standard production pattern: synchronous replicas within a region (1 to 5 ms RTT) for RPO zero, asynchronous replicas across regions for disaster recovery (RPO equals the replication lag at failure time, targeted at sub-second to a few seconds). Oracle and Microsoft databases recommend this topology.
• Google Spanner multi-region configurations commit when a majority of replicas across regions acknowledge, accepting 50 to 100+ ms commit latency for zero RPO across regional failures. Suitable for globally consistent transactions where write latency is an acceptable cost.
• The read-latency benefit is substantial: serving reads from regional followers reduces latency by 50 to 150 ms compared to reading from a leader in a remote region. This is critical for user-facing applications with global users. Followers must handle staleness or apply explicit consistency mechanisms.
• Replication bandwidth and cost: cross-region data transfer is expensive (typically $0.02 to $0.12 per GB depending on provider and regions). High-write-throughput systems can incur thousands to tens of thousands of dollars per month in transfer costs for multi-region replication; a rough estimate is sketched after this list.
• Alternative patterns for global writes: multi-leader replication with Conflict-free Replicated Data Types (CRDTs) or last-write-wins accepts eventual consistency to allow local writes in each region. Application-level sharding by region (users in Europe write to an EU leader, users in Asia to an Asia leader) avoids cross-region sync but limits global transactions.
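The transfer-cost claim above is easy to sanity check. The sketch below uses hypothetical placeholder values (50 MB/s of replicated writes, two remote regions, the low end of the quoted price range); substitute your own throughput and your provider's rates.

```python
# Rough monthly cross-region replication transfer cost.

write_throughput_mb_per_s = 50     # sustained volume of replicated writes (assumed)
remote_regions = 2                 # each remote replica receives a full copy
price_per_gb_usd = 0.02            # low end of the quoted $0.02-$0.12/GB range

seconds_per_month = 30 * 24 * 3600
gb_per_month = write_throughput_mb_per_s * seconds_per_month / 1024

monthly_cost_usd = gb_per_month * remote_regions * price_per_gb_usd
print(f"~{gb_per_month:,.0f} GB/month per remote region")
print(f"~${monthly_cost_usd:,.0f}/month at ${price_per_gb_usd}/GB x {remote_regions} regions")
# At the $0.12/GB end of the range the same workload costs roughly six times
# more, i.e. tens of thousands of dollars per month.
```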
📌 Examples
Netflix Cassandra multi-region: Netflix runs Cassandra clusters across AWS regions (us-east, us-west, eu-west) with asynchronous replication between regions. Each region serves local reads and writes, with eventual consistency across regions. Write latency stays under 10 ms locally, avoiding the cross-region penalty. Replication lag between regions is monitored and typically kept under 1 second. The design accepts the need for conflict resolution, with last-write-wins semantics.
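A minimal last-write-wins merge, of the kind the Cassandra example accepts, looks like the sketch below. This is a simplification: real systems resolve conflicts per cell, apply tie-breakers, and remain exposed to clock skew, and one of the concurrent writes is silently lost by design.

```python
from dataclasses import dataclass

@dataclass
class VersionedValue:
    value: str
    timestamp_us: int   # writer-assigned wall-clock timestamp (microseconds)

def lww_merge(a: VersionedValue, b: VersionedValue) -> VersionedValue:
    """Keep whichever write carries the higher timestamp."""
    return a if a.timestamp_us >= b.timestamp_us else b

# Concurrent writes to the same key in two regions, later replicated:
us_write = VersionedValue("shipped",   1_700_000_000_000_200)
eu_write = VersionedValue("cancelled", 1_700_000_000_000_150)
print(lww_merge(us_write, eu_write).value)   # "shipped" -- the EU write is dropped
```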
PostgreSQL with Patroni and streaming replication: primary in us-east-1a, synchronous standby in us-east-1b (same region, different availability zone), asynchronous standby in eu-west-1. Synchronous commit waits for the us-east-1b acknowledgment (2 to 5 ms RTT), achieving RPO zero for an availability-zone failure. The EU standby trails by 500 ms to 2 seconds under normal load, providing disaster recovery with a non-zero RPO. Total commit latency is approximately 10 ms at p99, dominated by disk fsync and synchronous replication.
Azure Cosmos DB multi-region writes: Cosmos DB supports multi-leader operation (multi-region writes) with configurable consistency levels. Strong consistency across regions incurs cross-region RTT on every write, which violates low-latency SLOs. Bounded staleness allows a configurable lag (for example, 100,000 operations or 5 minutes); session consistency provides read-your-writes within a session. Typical deployments use session or eventual consistency for low latency, accepting conflict resolution via last-write-wins or custom merge policies.
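To illustrate what session (read-your-writes) consistency buys, here is a simplified conceptual sketch; it is not the Cosmos DB client API. The client carries a token recording the highest write position it has seen, and a replica serves the read only if it has applied at least that position.

```python
class Replica:
    def __init__(self):
        self.applied_lsn = 0      # how far this replica has applied the replicated log
        self.data = {}

class SessionClient:
    def __init__(self):
        self.session_lsn = 0      # highest position this session has written or read

    def ack_write(self, lsn: int):
        # Record the position of our own write once the write path confirms it.
        self.session_lsn = max(self.session_lsn, lsn)

    def read(self, replica: Replica, key: str):
        # Refuse a replica that has not yet applied this session's writes.
        if replica.applied_lsn < self.session_lsn:
            raise RuntimeError("replica too stale for this session; retry another replica")
        return replica.data.get(key)

# Usage: the session rejects a lagging replica but reads its own write elsewhere.
client = SessionClient()
client.ack_write(42)                 # our write landed at position 42

lagging = Replica()
lagging.applied_lsn = 40

caught_up = Replica()
caught_up.applied_lsn = 45
caught_up.data["order"] = "paid"

try:
    client.read(lagging, "order")
except RuntimeError as err:
    print("lagging replica:", err)
print("caught-up replica:", client.read(caught_up, "order"))
```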