Database Design • Relational vs NoSQLMedium⏱️ ~3 min
Operational Scale: Vertical vs Horizontal Growth
Relational databases scale up easily by adding CPU, memory, and faster storage to a single node, making them a great fit up to tens of thousands of transactions per second per node. Vertical scaling is operationally simple because there is no distributed coordination to manage. However, beyond a certain point, vertical scaling becomes expensive and hits hardware limits. Scaling out requires explicit sharding where the application or middleware routes queries to the correct shard, or adopting distributed SQL systems that handle sharding transparently via consensus protocols like Paxos or Raft. Distributed SQL adds complexity and latency due to cross shard coordination.
NoSQL databases are designed for horizontal scale from the ground up. Adding nodes linearly increases read and write capacity because data is partitioned across many nodes with no single coordinator. Netflix runs multi thousand node Cassandra clusters storing petabytes, achieving write availability and linear scale for streaming state. However, hot partitions limit per key throughput to often just a few thousand operations per second per partition. Write amplification from denormalization and secondary indexes multiplies write costs; one logical write can trigger 5 to 10 physical writes. Under load spikes, compaction in Log Structured Merge tree (LSM) stores can stall, driving tail latencies higher.
Cost considerations differ significantly. Relational databases deliver higher per node performance, requiring fewer nodes for moderate workloads, but licensing and vertical scaling can be expensive. NoSQL requires more nodes for the same logical dataset because replication and denormalization inflate storage 2 to 5 times, and operational cost scales with throughput and partition count. Choose relational for workloads up to tens of thousands of transactions per second with complex queries, and NoSQL for horizontally scalable, low latency, geo distributed systems handling millions of operations per second across well defined access patterns.
💡 Key Takeaways
•Relational databases scale vertically to tens of thousands of transactions per second per node with simple operations, but hit hardware and cost limits requiring explicit sharding or distributed SQL for further growth
•NoSQL scales horizontally by adding nodes linearly, enabling millions of operations per second, but hot partitions cap per key throughput to a few thousand operations per second
•Write amplification in NoSQL from denormalization and secondary indexes causes one logical write to trigger 5 to 10 physical writes; compaction stalls under load spikes drive tail latencies higher
•Cost tradeoff: relational delivers higher per node performance with fewer nodes but expensive licensing and vertical scaling; NoSQL requires more nodes and storage inflated 2 to 5 times by replication and denormalization
•Netflix runs multi thousand node Cassandra clusters storing petabytes for streaming state; Amazon Aurora supports millions of reads per second via read replicas for read heavy relational workloads
📌 Examples
E-commerce checkout system: relational database with vertical scaling handles 20k transactions per second for order processing with ACID guarantees and complex inventory queries
Netflix Cassandra: multi thousand node clusters storing petabytes of streaming state, chosen for write availability and linear horizontal scale across regions
Amazon Aurora: millions of reads per second via read replicas for read heavy catalog queries, with strongly consistent primary writes scaling vertically
Social media activity feeds: NoSQL with horizontal partitioning across thousands of nodes, denormalized documents pre joined for low latency reads at millions of operations per second