Replication & Consistency • Leader-Follower ReplicationMedium⏱️ ~3 min
Commit Semantics: Synchronous vs Asynchronous vs Semi-Synchronous Replication
The acknowledgment policy for replication determines the fundamental tradeoff between write latency and durability, measured as Recovery Point Objective (RPO), or how much data can be lost on failover. With synchronous replication, the leader waits for acknowledgments from one or more followers before confirming the write to the client. This increases commit latency by at least one network Round Trip Time (RTT) to the synchronous follower, typically 1 to 5 milliseconds within a datacenter or 50 to 150 milliseconds across regions. The benefit is near zero RPO relative to those followers, since the data exists on multiple nodes before the client receives confirmation. Google Spanner uses this model with Paxos groups where writes commit after a majority of replicas durably log the entry, resulting in 10 to 20 milliseconds commit latency intra region and 50 to 150 plus milliseconds for multi region configurations.
Asynchronous replication provides the opposite tradeoff: the leader acknowledges immediately after local durability without waiting for any followers. Commit latency is minimal, often under 1 millisecond for in memory acknowledgment, but recent writes can be lost if the leader fails before followers replicate them. The RPO equals the replication lag at failure time, which can be seconds in a stressed system. Many production databases default to asynchronous replication for performance. Oracle Data Guard recommends asynchronous mode over Wide Area Networks (WANs) with tens to hundreds of milliseconds RTT to avoid unacceptable commit penalties, accepting RPO targets of sub second to few seconds based on how aggressively the standby applies logs.
Semi synchronous replication balances these extremes by requiring at least one synchronous follower for durability while keeping additional replicas asynchronous for read scaling and geographic distribution. This is common in production: MySQL semi sync waits for one replica acknowledgment, Microsoft SQL Server Always On can configure one synchronous secondary for automatic failover within a region while maintaining asynchronous secondaries in remote regions for disaster recovery. The result is zero or near zero data loss within the synchronous fault domain with minimal commit latency penalty (one RTT to nearest follower), while still providing read replicas globally. The key metric to watch is the p99 commit latency: if your Service Level Objective (SLO) requires sub 10 millisecond writes, synchronous replication to remote regions will violate it, but metro area semi sync typically fits within budget.
💡 Key Takeaways
•Synchronous replication adds one RTT to commit latency (1 to 5 ms LAN, 50 to 150 ms cross region) but achieves RPO near zero. Google Spanner commits after majority acknowledge, seeing 10 to 20 ms intra region latency.
•Asynchronous replication minimizes write latency to under 1 ms but accepts RPO equal to replication lag at failure, often 1 to 5 seconds in healthy systems, potentially more under load. Oracle Data Guard uses async over WAN to avoid large commit penalties.
•Semi synchronous requires one synchronous replica for zero data loss with one RTT penalty, plus async replicas for scale. MySQL semi sync and SQL Server Always On use this pattern for intra region durability with global read replicas.
•Majority or quorum replication (as in Raft, Paxos groups, or MongoDB majority write concern) commits when majority have logged entry, protecting against any minority of failures. Kafka with acks=all waits for all In Sync Replicas (ISR) to acknowledge.
•Commit latency equals local persist time plus kth fastest follower RTT plus follower apply time, where k depends on policy. p99 latency often dominates user experience; monitor it closely against SLO targets.
•Choosing policy requires explicit RPO and Recovery Time Objective (RTO) targets. Systems with zero data loss requirements (financial transactions, inventory) must use synchronous or majority commit despite latency cost. Read heavy workloads with relaxed consistency can use async for maximum throughput.
📌 Examples
Google Spanner multi region configuration: Write to leader in us central1 must achieve majority across 5 replicas in us central1, us east1, us west1. If majority includes a replica 2000 km away with 40 ms RTT, commit latency floor is approximately 40 ms plus local persist time, explaining observed 50 to 100 ms p50 latencies for multi region instances.
MySQL with rpl_semi_sync_master_timeout set to 1000 ms: Master waits up to 1 second for at least one semi sync replica acknowledgment. If timeout expires (replica too slow or failed), master falls back to async mode and logs warning. This prevents writes from blocking indefinitely while sacrificing zero RPO guarantee during degradation.
MongoDB replica set with writeConcern set to JSON object { w: 'majority', wtimeout: 5000 }: Primary waits for majority of voting members to acknowledge within 5 second timeout. If majority unreachable, write fails with timeout error rather than acknowledging with uncertain durability. Typical majority acknowledge time is 5 to 20 ms within same region with healthy replicas.