Database DesignTransaction Isolation LevelsMedium⏱️ ~3 min

Critical Anomalies: Dirty Reads, Non-Repeatable Reads, and Phantoms

Understanding isolation anomalies is essential for reasoning about correctness bugs in concurrent systems. A dirty read occurs when Transaction A reads uncommitted data from Transaction B, then B rolls back, leaving A with values that never actually existed. This can cause cascading failures: imagine reading an inventory count of 50 units, allocating them, then discovering the count was actually 10 after rollback. A non-repeatable read happens when Transaction A reads a row, Transaction B updates and commits that row, then A reads it again and gets different values. This breaks assumptions in multi-step business logic. For example, a banking transaction reads account balance as 1000 dollars, performs validation checks, then reads again to find 500 dollars because another transaction processed in between. The validation is now invalid but the transaction proceeds. Phantom reads occur when a predicate query (like "SELECT COUNT where status equals pending") returns different result sets on repeated execution due to inserts or deletes by other transactions. A classic failure: check "no overlapping reservations exist for this time slot" returns zero rows, proceed to insert reservation, but another transaction also passed the same check and both reservations conflict. At Repeatable Read without predicate locking, uniqueness checks by range can fail this way. Beyond ANSI anomalies, production systems face write skew (two transactions each read a consistent snapshot, both pass constraints, but their combined effect violates invariants) and lost updates (concurrent read modify write cycles overwrite each other). PostgreSQL Serializable using SSI detects these but at high write contention (over 5,000 updates per second on hot rows), abort rates can exceed 5 to 10 percent. MySQL InnoDB prevents phantoms via next key gap locking but this can drop throughput 2 to 5 times on hot secondary indexes with range queries.
💡 Key Takeaways
Dirty reads cause the most severe corruption because transactions see data that never existed after rollbacks, can cascade through dependent logic
Non-repeatable reads break multi-step validation logic where a transaction assumes row values remain stable across reads within the same transaction
Phantom reads occur at the predicate level (query result sets change) rather than row level, breaking uniqueness checks and aggregate computations
Write skew under snapshot isolation is common: two doctors on call both see each other, both go off call, invariant (at least one on call) breaks despite each transaction passing checks
Lost updates happen at Read Committed when concurrent read modify write cycles overwrite changes, mitigate with optimistic locking or compare and set semantics
PostgreSQL SSI at over 5,000 updates per second on hot counters can see 5 to 10 percent abort rates, requiring application retry logic and idempotent operations
📌 Examples
Banking: Transaction reads balance 1000 dollars for validation, concurrent withdrawal commits 500 dollars, validation now invalid but transaction proceeds with stale value
Reservation system: Two transactions check "no overlapping bookings for 2 PM to 3 PM" both see empty result, both insert bookings, double booking occurs
Inventory allocation: Read stock count 50, allocate 50 units, but count was updated to 10 by concurrent transaction, oversold by 40 units
MySQL InnoDB with range scan on secondary index (SELECT * FROM posts WHERE user_id BETWEEN 1000 AND 2000) acquires gap locks, serializing concurrent inserts and dropping write throughput 2 to 5 times
← Back to Transaction Isolation Levels Overview