Database Design • Distributed SQL (CockroachDB, Spanner) • Medium • ⏱️ ~2 min
Trade-offs: Distributed SQL vs Single Node and NoSQL
Choosing distributed SQL means accepting higher write latency in exchange for horizontal scale and automatic failover. A well-tuned single-node PostgreSQL or MySQL instance can commit transactions in under 1 millisecond and handle tens of thousands of transactions per second on one powerful server. Distributed SQL pays at least one network round trip for consensus on every write: 2 to 5 milliseconds within a region, or 50 to 150 milliseconds cross-region at the 95th percentile. You gain the ability to survive datacenter failures and to scale beyond a single machine, but every write becomes more expensive.
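The latency arithmetic can be made concrete with a minimal sketch. The function and numbers below are illustrative, taken from the figures above (not measurements from any specific system):

```python
def commit_latency_ms(local_ms: float, consensus_rtt_ms: float, rounds: int = 1) -> float:
    """Rough commit-latency model: local work plus consensus round trips."""
    return local_ms + rounds * consensus_rtt_ms

# Single-node commit: no consensus round trip.
single_node = commit_latency_ms(0.5, 0)            # ~0.5 ms
# Distributed SQL, replicas in one region.
regional = commit_latency_ms(0.5, 2.5)             # ~3 ms
# Distributed SQL, replicas across regions (p95).
cross_region = commit_latency_ms(0.5, 100)         # ~100 ms
```

Even one consensus round trip per write dominates the commit cost once replicas leave the local network, which is why the single-node baseline is so hard to beat on raw latency.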
Compared to manually sharded relational databases like Vitess at YouTube or the sharded MySQL systems at DoorDash, distributed SQL removes application complexity. With manual sharding, your application code must route queries to the correct shard, handle cross-shard transactions with application-level coordination, and manage shard rebalancing when traffic patterns change. Distributed SQL hides all of this behind standard SQL, but it introduces stricter clock-synchronization requirements and more complex internal mechanisms that can be harder to debug during incidents.
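A sketch of the application-level routing that manual sharding forces on you. The shard names and `shard_for` helper are hypothetical; real systems like Vitess use more sophisticated vindex schemes:

```python
import hashlib

# Hypothetical shard inventory the application must know about.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]

def shard_for(user_id: str) -> str:
    """Hash the shard key and route to a shard; the app, not the DB, decides."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]
```

Note the hidden cost: adding a shard changes `len(SHARDS)`, which remaps most keys, so the application also owns the rebalancing migration. Under distributed SQL the same query is plain SQL and the database handles placement.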
Against NoSQL systems like Cassandra or DynamoDB that offer eventual consistency, distributed SQL provides strong consistency, joins, and relational schemas. This simplifies application logic enormously: you never need conflict-resolution code, you can use foreign keys and transactions, and you can trust that reads reflect all prior writes. The costs are lower write throughput at the same hardware budget (consensus requires majority acknowledgment), stricter availability requirements (a majority of replicas must be online), and less flexibility during network partitions, where eventually consistent systems can keep accepting writes on every side of the partition.
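The majority-acknowledgment rule behind both the throughput and availability costs is simple to state in code. A minimal sketch, assuming a consensus protocol that commits only with a strict majority of replica acknowledgments:

```python
def can_commit(acks: int, replicas: int) -> bool:
    """A write commits only when a strict majority of replicas acknowledge it."""
    return acks >= replicas // 2 + 1

# With 3 replicas, losing one is fine; losing two stalls all writes.
assert can_commit(2, 3)
assert not can_commit(1, 3)
# With 5 replicas, the quorum is 3.
assert can_commit(3, 5)
```

This is why a partition that isolates a minority of replicas makes that side unavailable for writes, whereas an eventually consistent store would keep accepting them and reconcile later.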
Distributed SQL shines for workloads that need both correctness and scale. Financial ledgers at companies like Coinbase, global user identity systems, order management platforms, and configuration state all benefit from strong consistency at scale. For content feeds, analytics pipelines, or caching layers where stale data is acceptable and write throughput is critical, eventually consistent stores or even single node databases with read replicas remain simpler and cheaper solutions.
💡 Key Takeaways
• Single-node databases commit in under 1ms but cannot scale beyond one machine; distributed SQL adds 2 to 5ms regional or 50 to 150ms global latency for consensus
• Manual sharding requires application code for shard routing and cross-shard coordination; distributed SQL hides that complexity but adds clock-synchronization requirements
• NoSQL eventual consistency offers higher write throughput and partition tolerance; distributed SQL provides strong consistency and a relational model at lower throughput
• Distributed SQL requires a majority of replicas to be available; loss of majority makes a shard unavailable, unlike eventually consistent systems that accept writes in any partition
• Best-fit workloads: financial ledgers, identity systems, and order management, where correctness matters more than raw throughput or the lowest possible latency
📌 Examples
PostgreSQL: under 1ms commits, 50K+ TPS on a single server, no automatic failover vs CockroachDB: ~5ms commits, horizontally scalable TPS via sharding, survives zone failures
YouTube Vitess: application routes queries by shard key, cross-shard joins in app code vs Spanner: standard SQL handles all routing and cross-shard transactions
Cassandra: 100K+ writes/sec per node, accepts writes during partition, eventual consistency vs Spanner: lower write rate, unavailable without majority, strong consistency