What Are Read Replicas and Why Do They Matter?

What Are Read Replicas
Read replicas are read-only copies of your primary database that stay synchronized through replication. When a write commits on the primary, the change streams to replicas with some delay called replication lag. This creates multiple copies of your data that can serve read queries, distributing load across many servers instead of concentrating it on one.
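The fundamental tradeoff: replicas increase read capacity but introduce complexity around consistency. With asynchronous replication, data on replicas is usually slightly behind the primary. Applications must understand and handle this staleness, or users encounter confusing behavior where data appears and disappears: a user saves a comment, the next page load reads from a lagging replica, and the comment seems to have vanished.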
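One common way to handle this staleness is to send a user's reads to the primary for a short window after that user's own write. The sketch below is illustrative only: the class name `ReadRouter`, the `pin_seconds` window, and the `"primary"`/`"replica"` labels are assumptions, not part of any particular database or driver.

```python
import time


class ReadRouter:
    """Route a user's reads to the primary shortly after that user writes."""

    def __init__(self, pin_seconds=1.0, clock=time.monotonic):
        # pin_seconds is an assumed upper bound on replication lag.
        self.pin_seconds = pin_seconds
        self.clock = clock
        self._last_write = {}  # user_id -> time of that user's latest write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def target_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"  # replica may not have this user's write yet
        return "replica"
```

Users who have not written recently still read from replicas, so most read traffic stays off the primary. The window only helps if it is longer than the actual lag, which is why it is usually paired with lag monitoring.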
Replication Mechanisms
Asynchronous replication streams committed transactions from primary to replicas without waiting for acknowledgment. The primary commits immediately, and replicas apply changes as fast as they can. This minimizes write latency but means replicas lag behind by milliseconds to seconds depending on write volume and network conditions.
Synchronous replication requires at least one replica to acknowledge before the primary commits. This guarantees the replica has received the data but adds network round-trip latency to every write. Semi-synchronous modes wait for an acknowledgment but fall back to asynchronous behavior if none arrives within a timeout, trading some durability for availability.
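The latency difference between these modes can be approximated with a simple model. This is a deliberately simplified sketch: the function name and parameters are hypothetical, and it assumes the local commit and the replica acknowledgment happen one after the other, whereas real systems may overlap them.

```python
def commit_latency_ms(mode, local_commit_ms, replica_ack_ms, timeout_ms=None):
    """Estimate how long the client waits for a commit under each mode."""
    if mode == "async":
        # The primary does not wait for any replica.
        return local_commit_ms
    if mode == "sync":
        # The primary waits for the acknowledgment, however long it takes.
        return local_commit_ms + replica_ack_ms
    if mode == "semi-sync":
        if timeout_ms is None:
            raise ValueError("semi-sync needs a timeout_ms")
        # The primary waits for the acknowledgment but gives up at the timeout.
        return local_commit_ms + min(replica_ack_ms, timeout_ms)
    raise ValueError(f"unknown mode: {mode}")
```

With a 2 ms local commit and a 5 ms acknowledgment, the model gives 2 ms for asynchronous and 7 ms for synchronous commits. If the replica stalls and the acknowledgment would take 500 ms, a semi-synchronous commit with a 100 ms timeout is capped at 102 ms.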
Replication Lag Characteristics
Replication lag is measured as the time difference between when a transaction commits on the primary and when the replica applies it. Within a region, typical lag ranges from 10 to 100 ms under normal conditions but can spike to seconds during write bursts, long transactions, or network issues.
Lag accumulates when replicas cannot apply changes as fast as the primary produces them. Large transactions, schema changes, and bulk operations are common causes. Monitoring lag is critical: a replica 5 seconds behind serves data that may confuse users expecting recent updates.
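A monitoring check built on this definition might look like the sketch below. The function names, the millisecond timestamps, and the 1,000 ms threshold are illustrative assumptions; real databases expose lag through their own views and metrics.

```python
def replication_lag_ms(primary_commit_ts_ms, replica_apply_ts_ms):
    """Lag = time the replica applied the change minus the primary commit time."""
    # Clamp at zero in case of small clock differences between hosts.
    return max(0, replica_apply_ts_ms - primary_commit_ts_ms)


def healthy_replicas(lag_by_replica_ms, max_lag_ms=1000):
    """Return the names of replicas whose lag is within the allowed budget."""
    return sorted(
        name for name, lag in lag_by_replica_ms.items() if lag <= max_lag_ms
    )
```

For example, `healthy_replicas({"r1": 40, "r2": 5000, "r3": 90})` keeps `r1` and `r3` and drops the replica that is 5 seconds behind, so a router can stop sending it reads until it catches up.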
Read Scaling Benefits
Adding replicas increases read throughput roughly linearly. If your primary handles 10,000 reads per second and you add 3 replicas of the same capacity, total capacity becomes about 40,000 reads per second. This scales horizontally without significant changes to application code; mostly, only routing decisions change.
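The arithmetic can be written out as a small helper. The function name and the optional `primary_overhead_pct` parameter are assumptions for illustration; the overhead parameter models the idea that each replica costs the primary some capacity, treated here as a flat percentage of the primary's read throughput per replica.

```python
def total_read_capacity(per_node_rps, replicas, primary_overhead_pct=0):
    """Total reads/second across one primary and N equal-sized replicas."""
    # Share of the primary's capacity left after per-replica overhead.
    primary_share_pct = max(0, 100 - primary_overhead_pct * replicas)
    primary_rps = per_node_rps * primary_share_pct // 100
    return primary_rps + per_node_rps * replicas
```

`total_read_capacity(10_000, 3)` gives 40,000, matching the example above. With an assumed 10 percent overhead per replica, the same setup yields 37,000, which is why the scaling is only approximately linear.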
Geographic distribution places replicas closer to users in different regions, reducing read latency. A user in Europe reads from a European replica instead of crossing the Atlantic to a US primary. Writes must still travel to the primary, so write latency is unchanged, but reads, typically the majority of traffic, become local.
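A region-aware routing decision can be sketched as follows. The function name and the region strings are hypothetical; a production router would also account for replica health and lag.

```python
def choose_region(operation, user_region, primary_region, replica_regions):
    """Pick the region that should serve a request."""
    if operation == "write":
        # Writes always go to the primary, wherever the user is.
        return primary_region
    if user_region in replica_regions:
        # Reads stay local when a replica exists in the user's region.
        return user_region
    # No local replica: fall back to the primary's region.
    return primary_region
```

With a primary in `us-east` and a replica in `eu-west`, a European user's reads resolve to `eu-west` while that user's writes still resolve to `us-east`.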

💡 Key Takeaways

✓ Replication lag in same-region deployments averages 10 to 100 milliseconds, while cross-region lag ranges from 100 to 1000 milliseconds due to network distance and batching

✓ Production systems commonly deploy 3 to 10 replicas per shard to handle read-to-write ratios of 5:1 to 20:1 in social, content, and commerce workloads

✓ Amazon Aurora supports up to 15 read replicas with single- to double-digit millisecond lag within a region by sharing a distributed storage layer across all instances

✓ Each replica adds 10 to 30 percent overhead on the primary database due to replication fanout, log generation, and network bandwidth consumption

✓ Cross-region replicas incur data transfer costs on every write (billed per GB; rates vary by provider and region pair, commonly in the range of $0.02 to $0.09 per GB) plus full instance compute costs, often totaling 1 to 3 times the primary cost

📌 Interview Tips

1. A social media feed service handles 50,000 reads per second and 5,000 writes per second. With 5 read replicas, each replica serves approximately 10,000 read QPS while the primary handles 5,000 writes plus some reads that require strict consistency.

2. Amazon RDS supports up to 15 read replicas per primary instance (the exact limit depends on the database engine), distributable across availability zones or regions. Multi Availability Zone (AZ) deployments use synchronous replication for high availability, not read scaling.

3. Aurora Global Database maintains typical cross-region replication lag under 1 second in steady state, enabling geo-local reads with bounded staleness for international user bases.
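The load split in the first tip can be checked with a few lines of arithmetic. The helper below is a sketch: the function name and the `strict_read_pct` parameter (the share of reads that must hit the primary for strict consistency) are assumptions, since the tip does not give that share.

```python
def read_load_split(total_read_qps, replicas, strict_read_pct=0):
    """Return (reads sent to the primary, reads per replica)."""
    # Reads that need strict consistency are served by the primary.
    primary_reads = total_read_qps * strict_read_pct // 100
    # The remaining reads are spread evenly across the replicas.
    per_replica = (total_read_qps - primary_reads) / replicas
    return primary_reads, per_replica
```

With 50,000 reads per second and 5 replicas, every replica serves 10,000 QPS when no reads are pinned to the primary. If an assumed 2 percent of reads need strict consistency, the primary takes 1,000 of them and each replica serves 9,800.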
