Database DesignTransaction Isolation LevelsMedium⏱️ ~3 min

Critical Anomalies: Dirty Reads, Non-Repeatable Reads, and Phantoms

Why Anomalies Matter

Isolation levels exist to prevent specific failure modes called anomalies. When concurrent transactions interact without proper isolation, they can produce incorrect results even though each transaction individually follows the rules. Understanding these anomalies helps you choose the right isolation level for your workload.

Dirty Read

A transaction reads data written by another transaction that has not yet committed. If that transaction rolls back, the reader used values that never actually existed. Example: Transaction A reads inventory count of 50 from uncommitted Transaction B, allocates those units, then B rolls back revealing count was 10. A oversold by 40 units. Prevented at Read Committed and above.

Non-Repeatable Read

A transaction reads the same row twice and gets different values because another transaction modified and committed between reads. Example: Banking transaction reads balance as $1,000, validates withdrawal, another transaction debits $600, then A re-reads seeing $400. Validation used stale data. Prevented at Repeatable Read and above.

Phantom Read

A transaction executes a query twice and gets different sets of rows because another transaction inserted or deleted matching rows. Unlike non-repeatable reads (existing rows change), phantoms involve rows appearing or disappearing. Example: A checks for overlapping reservations, finds none, inserts. Meanwhile B does the same. Both pass validation but create double-booking. Only Serializable fully prevents phantoms, though some databases use gap locking (locking range between existing records) at Repeatable Read.

The Severity Hierarchy

Dirty reads are most dangerous (reading uncommitted garbage). Non-repeatable reads are less severe (data was at least committed). Phantoms are subtlest (query results change due to new rows). Higher isolation levels prevent more anomalies but cost more in performance or concurrency.

💡 Key Takeaways
Dirty reads cause the most severe corruption because transactions see data that never existed after rollbacks, can cascade through dependent logic
Non-repeatable reads break multi-step validation logic where a transaction assumes row values remain stable across reads within the same transaction
Phantom reads occur at the predicate level (query result sets change) rather than row level, breaking uniqueness checks and aggregate computations
Write skew under snapshot isolation is common: two doctors on call both see each other, both go off call, invariant (at least one on call) breaks despite each transaction passing checks
Lost updates happen at Read Committed when concurrent read modify write cycles overwrite changes, mitigate with optimistic locking or compare and set semantics
PostgreSQL SSI at over 5,000 updates per second on hot counters can see 5 to 10 percent abort rates, requiring application retry logic and idempotent operations
📌 Interview Tips
1Banking: Transaction reads balance 1000 dollars for validation, concurrent withdrawal commits 500 dollars, validation now invalid but transaction proceeds with stale value
2Reservation system: Two transactions check "no overlapping bookings for 2 PM to 3 PM" both see empty result, both insert bookings, double booking occurs
3Inventory allocation: Read stock count 50, allocate 50 units, but count was updated to 10 by concurrent transaction, oversold by 40 units
4MySQL InnoDB with range scan on secondary index (SELECT * FROM posts WHERE user_id BETWEEN 1000 AND 2000) acquires gap locks, serializing concurrent inserts and dropping write throughput 2 to 5 times
← Back to Transaction Isolation Levels Overview