Solving Read After Write Consistency with Routing Policies

The Read-After-Write Problem
Read-after-write consistency guarantees that if a user writes data and immediately reads it back, they see their own write. This is surprisingly difficult with asynchronous replicas. Consider a user posting a comment: the write commits to the primary at time T0, then at T0 plus 50ms the user refreshes the page.
With replication lag of 10-100ms, there is a significant probability that the replica serving the refresh has not yet applied the change. The user sees no comment and files a bug report. From the database's perspective this is correct behavior (eventual consistency working as designed), but it is a terrible user experience.
Session Pinning
Session pinning routes all requests from a user session to the primary for a configurable window after any write. For example, with a 5-second window, a user who writes at T0 has all their reads until T0 plus 5 seconds served by the primary. As long as the window exceeds the replication lag, this guarantees read-after-write consistency within a session without requiring global strong consistency.
Implementation requires tracking write timestamps per session. Store the last-write timestamp in a cookie, session store, or request context. On each read, compare current time to last-write timestamp. If within the pinning window, route to primary; otherwise route to replicas.
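The timestamp comparison described above can be sketched in a few lines. This is a minimal illustration rather than a production router: the class name `SessionPinningRouter`, the in-memory dictionary standing in for a cookie or session store, and the injectable `clock` parameter are all assumptions made for the example.

```python
import time
from typing import Callable, Dict

PINNING_WINDOW_SECONDS = 5.0  # assumed default, matching the 5-second example


class SessionPinningRouter:
    """Routes reads to the primary for a fixed window after a session's last write."""

    def __init__(self, window: float = PINNING_WINDOW_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window = window
        self.clock = clock  # injectable so the window can be tested without sleeping
        # Hypothetical stand-in for a cookie or session store: session_id -> last write time
        self.last_write: Dict[str, float] = {}

    def record_write(self, session_id: str) -> None:
        """Call after every successful write made by this session."""
        self.last_write[session_id] = self.clock()

    def route_read(self, session_id: str) -> str:
        """Return 'primary' inside the pinning window, otherwise 'replica'."""
        last = self.last_write.get(session_id)
        if last is not None and self.clock() - last < self.window:
            return "primary"
        return "replica"
```

A session that has never written is routed to replicas immediately, so only sessions with recent writes add load to the primary.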
Causal Consistency with Tokens
A more precise approach uses causal consistency tokens. When a write completes, the system returns a token representing that write's position in the replication stream (often the Log Sequence Number, or LSN: a monotonically increasing identifier for each write operation). Subsequent reads include this token; the system routes them to any replica that has applied at least that position.
This avoids unnecessary primary reads: if a replica has caught up past your token, it can serve the read. If no replica is caught up, fall back to primary. The token travels with the request through microservices, maintaining causal ordering across service boundaries.
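The routing rule can be sketched as follows. The `Replica` dataclass and `route_read` function are hypothetical names for this example, and the sketch assumes the router already knows each replica's applied position; how that position is obtained depends on the database and is not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    applied_lsn: int  # highest replication position this replica has applied


def route_read(replicas: List[Replica], token_lsn: Optional[int]) -> str:
    """Return the name of a node that can serve a read carrying token_lsn."""
    if token_lsn is None:
        # No token means no causal dependency, so any replica is acceptable.
        return replicas[0].name if replicas else "primary"
    for replica in replicas:
        if replica.applied_lsn >= token_lsn:
            return replica.name  # first replica that has caught up to the token
    # No replica has applied the write yet: fall back to the primary.
    return "primary"
```

With replicas at LSN 12480 and 12550, a read carrying token 12500 skips the first replica and is served by the second; a read carrying token 12600 falls back to the primary.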
Cross-Device Challenges
Session-based solutions fail when users switch devices or share URLs. A user posts from their phone, then opens the same page on their laptop—different session, different routing decisions. The laptop session has no knowledge of the phone write and may read from a lagging replica.
Solutions include user-level tracking (tie consistency tokens to user ID, not session), client-side storage (embed last-write tokens in URLs for sharing), or accepting that cross-device scenarios have weaker guarantees. The right choice depends on product requirements and how critical immediate consistency is for the use case.
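As one illustration of the user-level option, the sketch below keys the last-write position by user ID rather than by session, so a read from any device can look up the position it must observe. `UserTokenStore` is a hypothetical in-memory stand-in for whatever shared store a real deployment would use, and `can_serve` is an assumed helper name.

```python
from typing import Dict, Optional


class UserTokenStore:
    """In-memory stand-in for a shared store keyed by user ID (hypothetical)."""

    def __init__(self) -> None:
        self._latest: Dict[str, int] = {}

    def record_write(self, user_id: str, lsn: int) -> None:
        # Keep the highest position seen so a late-arriving update
        # never moves the user's token backwards.
        self._latest[user_id] = max(self._latest.get(user_id, 0), lsn)

    def required_lsn(self, user_id: str) -> Optional[int]:
        """Position a replica must have applied to serve this user, if any."""
        return self._latest.get(user_id)


def can_serve(replica_lsn: int, required: Optional[int]) -> bool:
    """A replica can serve the read if the user has no token or it has caught up."""
    return required is None or replica_lsn >= required
```

The trade-off is an extra lookup on each read and a shared store that must itself be read consistently, which is part of why some products accept weaker cross-device guarantees instead.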

💡 Key Takeaways

✓ Session pinning to the primary for 200 to 500 milliseconds after writes guarantees read-after-write consistency, but can push 30 to 50 percent of reads back to the primary in write-heavy user workflows

✓ Freshness tokens (LSN or GTID markers) let routers verify replica freshness per request, minimizing primary load by routing to replicas as soon as they catch up to the required positions

✓ Production systems at Amazon expose the same trade-off: DynamoDB offers a strongly consistent read option that costs double the read capacity units of an eventually consistent read, and S3 guarantees read-after-write consistency for object PUTs

✓ Explicit consistency knobs let critical user-facing paths pay for strong consistency (higher latency, primary load) while background jobs use eventually consistent replicas (lower latency, better throughput)

✓ Token propagation requires passing replication positions through entire request chains, including caches, queues, and service boundaries, significantly increasing implementation complexity

📌 Interview Tips

1. Session pinning: After a user submits a form, your web server sets a sticky cookie with a timestamp. For the next 300ms, all read queries from that cookie route to the primary. After 300ms, normal replica routing resumes.

2. Freshness token: User uploads a profile photo. Primary returns LSN 12500 with the success response. Client includes LSN 12500 in the next GET request. Router checks replicas: Replica1 at LSN 12480 (skip), Replica2 at LSN 12550 (use this one).

3. Tiered consistency: Your timeline API accepts a consistency_level parameter. 'strong' routes to primary (5ms p99 latency), 'bounded_staleness' requires replicas within 100ms lag (3ms p99), 'eventual' uses any healthy replica (2ms p99).
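The tiered consistency tip can be sketched as a small routing function. The names `Replica` and `route` are hypothetical, the 100ms bound is taken from the tip itself, and two behaviors are assumptions of this sketch rather than anything stated above: the router prefers the least-lagged eligible replica, and it falls back to the primary when no replica qualifies.

```python
from dataclasses import dataclass
from typing import List

BOUNDED_STALENESS_MAX_LAG_MS = 100.0  # bound taken from the tiered consistency tip


@dataclass
class Replica:
    name: str
    lag_ms: float          # current replication lag as measured by the router
    healthy: bool = True


def route(consistency_level: str, replicas: List[Replica]) -> str:
    """Pick a node name for a read with the requested consistency level."""
    if consistency_level == "strong":
        return "primary"
    healthy = [r for r in replicas if r.healthy]
    if consistency_level == "bounded_staleness":
        candidates = [r for r in healthy if r.lag_ms <= BOUNDED_STALENESS_MAX_LAG_MS]
    elif consistency_level == "eventual":
        candidates = healthy
    else:
        raise ValueError(f"unknown consistency level: {consistency_level!r}")
    if not candidates:
        return "primary"  # assumption: fall back rather than fail the read
    # Assumption: prefer the least-lagged eligible replica.
    return min(candidates, key=lambda r: r.lag_ms).name
```

Rejecting unknown levels with an error, instead of silently defaulting to one tier, keeps a typo in a caller from quietly weakening or strengthening its guarantees.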
