Global Time and Timestamp Ordering in Distributed SQL
The Global Ordering Problem
One of the hardest problems in distributed SQL is ensuring that transactions appear in a globally consistent order. When thousands of nodes process operations simultaneously across different datacenters, two transactions on opposite sides of the world might both claim to have committed first. Without synchronized time, the system cannot determine which truly happened first, breaking serializability (the guarantee that concurrent transactions produce the same result as if they executed one at a time in some serial order).
Unlike single-node databases where a local clock orders all operations, distributed SQL must establish ordering across nodes with independent clocks that drift relative to each other. Physical clocks can differ by milliseconds or even seconds. The challenge is creating a timestamp scheme that respects causality (if event A happens before event B, A always gets a lower timestamp) while enabling high-throughput concurrent operations.
TrueTime: Bounded Clock Uncertainty
One approach uses specialized hardware to achieve tight clock synchronization. Datacenters deploy atomic clocks and GPS receivers that provide time with known uncertainty bounds. A TrueTime API returns an interval rather than a single timestamp, indicating "the current time is definitely between lower and upper bounds, typically 1 to 7 milliseconds apart." This bounded uncertainty enables external consistency (the strongest form of distributed consistency where transactions appear to execute at a single point in real time).
When a transaction commits, the system performs a commit wait: it delays returning success to the client until the commit timestamp is definitely in the past according to every possible observer. The wait is roughly as long as the clock uncertainty, often 1 to 7 milliseconds. The trade-off is clear: tighter clock synchronization means shorter waits, which is why atomic clocks and GPS matter for performance, not just correctness.
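The commit wait can be illustrated with a minimal sketch. It assumes a simulated clock with a fixed uncertainty: SimulatedTrueTime, TTInterval, epsilon_s, and commit are hypothetical names chosen for illustration and are not a real TrueTime client. The uncertainty is modeled as a constant half-width around the local clock, whereas a real deployment would derive it from its time sources.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # lower bound on true time, in seconds
    latest: float    # upper bound on true time, in seconds


class SimulatedTrueTime:
    """Hypothetical stand-in for a TrueTime-like API (not a real library)."""

    def __init__(self, epsilon_s: float = 0.002):
        # Assumed fixed half-width of the uncertainty interval.
        self.epsilon_s = epsilon_s

    def now(self) -> TTInterval:
        t = time.time()
        return TTInterval(t - self.epsilon_s, t + self.epsilon_s)

    def after(self, ts: float) -> bool:
        # True only once ts is definitely in the past.
        return self.now().earliest > ts


def commit(tt: SimulatedTrueTime) -> float:
    # Pick a commit timestamp no earlier than any possible current time.
    commit_ts = tt.now().latest
    # Commit wait: hold the acknowledgment until commit_ts has passed
    # for every observer whose clock is within the uncertainty bound.
    while not tt.after(commit_ts):
        time.sleep(0.0005)
    return commit_ts
```

In this sketch the wait lasts about twice epsilon_s, that is, the full width of the interval, so halving the uncertainty halves the added commit latency.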
Hybrid Logical Clocks
An alternative approach combines physical timestamps with logical counters. Hybrid Logical Clocks (HLC) track both a physical time component (from the local system clock) and a logical counter that increments when events happen faster than the physical clock advances. When a node receives a message with a timestamp ahead of its local clock, it advances its HLC to that timestamp plus a logical increment, preserving causality without requiring atomic clocks.
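The update rules described above can be sketched as follows. This is a minimal illustration rather than any particular database's implementation: HLCTimestamp, HybridLogicalClock, and the injectable wall_clock parameter are names assumed for the example, and the physical component is taken to be integer milliseconds.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class HLCTimestamp:
    physical: int  # wall-clock component (assumed milliseconds)
    logical: int   # counter that breaks ties within one physical tick


class HybridLogicalClock:
    def __init__(self, wall_clock=lambda: int(time.time() * 1000)):
        self._wall_clock = wall_clock  # injectable so tests can control time
        self._last = HLCTimestamp(0, 0)

    def now(self) -> HLCTimestamp:
        """Timestamp a local event or an outgoing message."""
        pt = self._wall_clock()
        if pt > self._last.physical:
            # Physical clock advanced: adopt it and reset the counter.
            self._last = HLCTimestamp(pt, 0)
        else:
            # Events are outpacing the physical clock: bump the counter.
            self._last = HLCTimestamp(self._last.physical, self._last.logical + 1)
        return self._last

    def update(self, remote: HLCTimestamp) -> HLCTimestamp:
        """Merge the timestamp carried by an incoming message."""
        pt = self._wall_clock()
        physical = max(pt, self._last.physical, remote.physical)
        if physical == self._last.physical and physical == remote.physical:
            logical = max(self._last.logical, remote.logical) + 1
        elif physical == self._last.physical:
            logical = self._last.logical + 1
        elif physical == remote.physical:
            logical = remote.logical + 1
        else:
            logical = 0
        self._last = HLCTimestamp(physical, logical)
        return self._last
```

Because timestamps compare as (physical, logical) pairs, any event that follows the receipt of a message is ordered after the send, even when the receiver's physical clock lags behind the sender's.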
HLC still requires the clock offset between nodes to stay within a configured bound, for example a maximum of 500 milliseconds between any two nodes. If a node detects that its clock has moved beyond this threshold relative to its peers, it shuts itself down to protect consistency. This is a critical operational requirement: NTP (Network Time Protocol) must run reliably across all nodes, and monitoring must alert on clock drift before it triggers shutdowns.
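A monitoring check along these lines might look like the sketch below. It is deliberately simplified: clock_health, warn_fraction, and the three status strings are hypothetical, the offsets are assumed to be already measured against peers in milliseconds, and production systems generally apply more careful rules (for example, comparing against a majority of peers) before shutting a node down.

```python
MAX_OFFSET_MS = 500  # example bound from the text; the real value is deployment-specific


def clock_health(offsets_ms, max_offset_ms=MAX_OFFSET_MS, warn_fraction=0.5):
    """Classify a node's clock from its measured offsets to peers (in ms)."""
    worst = max((abs(o) for o in offsets_ms), default=0)
    if worst > max_offset_ms:
        return "shutdown"  # beyond the bound: the node must stop serving
    if worst > warn_fraction * max_offset_ms:
        return "alert"     # approaching the bound: page an operator
    return "ok"
```

Alerting at a fraction of the limit gives operators time to repair NTP before any node reaches the shutdown threshold.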
MVCC and Timestamp Ordering
Both timestamp schemes integrate with MVCC (Multi-Version Concurrency Control), where every write creates a new timestamped version of a key rather than overwriting in place. The storage layer maintains multiple versions, enabling reads at any past timestamp. A read transaction picks a snapshot timestamp and sees exactly the database state as of that moment, unaffected by concurrent writes with later timestamps.
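As a rough illustration of versioned storage and snapshot reads, the sketch below keeps a sorted list of versions per key in memory. MVCCStore and its methods are hypothetical names, timestamps are plain integers, and real storage engines add garbage collection, write intents, and persistence that are omitted here.

```python
import bisect
from collections import defaultdict


class MVCCStore:
    def __init__(self):
        # For each key, parallel lists kept sorted by timestamp.
        self._timestamps = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, key, value, ts):
        # Add a new version instead of overwriting the previous one.
        i = bisect.bisect_right(self._timestamps[key], ts)
        self._timestamps[key].insert(i, ts)
        self._values[key].insert(i, value)

    def read(self, key, snapshot_ts):
        # Return the newest version with timestamp <= snapshot_ts.
        stamps = self._timestamps.get(key, [])
        i = bisect.bisect_right(stamps, snapshot_ts)
        return self._values[key][i - 1] if i else None
```

A reader holding snapshot timestamp 15 keeps seeing the version written at 10 even after another transaction writes at 20, which is what isolates it from concurrent writers.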
Multi-range transactions use Two-Phase Commit (2PC): the coordinator picks a provisional timestamp, sends writes to all involved ranges, waits for prepared acknowledgments, then commits at a final timestamp. 2PC ensures atomicity (all ranges commit or none do), and timestamp ordering ensures serializability. The practical impact: a banking application can read an account balance in Tokyo immediately after a deposit in London, with guarantees that it sees the deposit and never observes time moving backwards.
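The commit flow can be sketched as a toy coordinator. Everything here is an assumption made for illustration: Range, two_phase_commit, and the healthy flag are hypothetical, all participants live in one process, and the final timestamp is simply the provisional one, whereas a real system may push it forward and must survive coordinator and participant failures.

```python
class Range:
    """Hypothetical participant that stages writes until told to commit."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.staged = {}     # txn_id -> {key: value}
        self.committed = {}  # key -> (value, commit_ts)

    def prepare(self, txn_id, kvs):
        if not self.healthy:
            return False  # vote "no"
        self.staged[txn_id] = dict(kvs)
        return True       # vote "yes": the writes are staged

    def commit(self, txn_id, commit_ts):
        for key, value in self.staged.pop(txn_id).items():
            self.committed[key] = (value, commit_ts)

    def abort(self, txn_id):
        self.staged.pop(txn_id, None)


def two_phase_commit(txn_id, writes, provisional_ts):
    """writes maps each Range to its {key: value} updates.

    Returns the commit timestamp, or None if the transaction aborted.
    """
    prepared = []
    # Phase 1: ask every range to prepare.
    for rng, kvs in writes.items():
        if not rng.prepare(txn_id, kvs):
            for p in prepared:
                p.abort(txn_id)  # roll back the ranges that already staged
            return None
        prepared.append(rng)
    # Phase 2: all voted yes, so commit everywhere at one timestamp.
    commit_ts = provisional_ts  # simplification: no timestamp push
    for rng in prepared:
        rng.commit(txn_id, commit_ts)
    return commit_ts
```

Because every range applies its writes at the same commit timestamp, a snapshot read at or after that timestamp sees all of the transaction's writes, and a read before it sees none.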