Implementation Patterns for Production Systems
Data modeling patterns differ fundamentally between relational and NoSQL systems. In relational systems, normalize to eliminate duplication and let the database enforce referential integrity via foreign keys. Denormalize selectively, for example with materialized views, when read latency dominates. In NoSQL, model around aggregates and access paths: choose partition keys that distribute load evenly and co-locate data that is read together. Precompute "pre-joined" documents or rows that match query shapes. For time-series data, bucket by time window and add a random or hash suffix to spread writes; a range query over one bucket then reads every suffix for that bucket.
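The time-bucketing idea can be sketched as follows. This is a minimal illustration, not a prescription: the hourly window, the `NUM_SHARDS` value, the `device#bucket#shard` key layout, and the function names are all assumptions made for the example.

```python
import hashlib
from datetime import datetime, timezone
from typing import List

NUM_SHARDS = 8  # assumed number of hash suffixes per time bucket


def bucket_start(ts: datetime) -> str:
    """Truncate a timezone-aware timestamp to its hourly bucket in UTC."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")


def partition_key(device_id: str, ts: datetime, event_id: str) -> str:
    """Build 'device#bucket#shard' so one hour of writes spreads over NUM_SHARDS keys."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % NUM_SHARDS
    return f"{device_id}#{bucket_start(ts)}#{shard}"


def keys_for_bucket(device_id: str, ts: datetime) -> List[str]:
    """Reading one bucket means fanning out to every shard suffix of that bucket."""
    bucket = bucket_start(ts)
    return [f"{device_id}#{bucket}#{shard}" for shard in range(NUM_SHARDS)]
```

The suffix is a deterministic hash of the event ID rather than a random number, so a retried write lands on the same key. The cost of spreading writes is that reads for a bucket become `NUM_SHARDS` queries instead of one.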
Consistency and correctness patterns require careful design in NoSQL. Prefer session semantics (read-your-writes) for user-facing read-after-write scenarios. For critical invariants such as uniqueness or counters, use conditional updates with optimistic concurrency control (compare-and-set or version checks), or route all writes for the invariant through a single-writer partition. For globally unique identifiers (IDs), use monotonic ID generators or centralized allocators to avoid collisions. Avoid uncoordinated per-region generators, which can produce duplicate IDs.
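A version-checked conditional update might look like the sketch below. `VersionedStore` is a hypothetical in-memory stand-in for a store that supports conditional writes; it is not any real database client, and the retry count is an arbitrary choice.

```python
import threading
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Versioned:
    value: int
    version: int


class VersionedStore:
    """Hypothetical in-memory stand-in for a store with conditional writes."""

    def __init__(self) -> None:
        self._items: Dict[str, Versioned] = {}
        self._lock = threading.Lock()  # models the store's per-item atomicity

    def get(self, key: str) -> Versioned:
        with self._lock:
            return self._items.get(key, Versioned(value=0, version=0))

    def put_if_version(self, key: str, new_value: int, expected_version: int) -> bool:
        """Write only if the stored version still equals expected_version."""
        with self._lock:
            current = self._items.get(key, Versioned(value=0, version=0))
            if current.version != expected_version:
                return False  # another writer committed first
            self._items[key] = Versioned(new_value, expected_version + 1)
            return True


def decrement_stock(store: VersionedStore, sku: str, qty: int, max_retries: int = 5) -> bool:
    """Read, check the invariant, then write conditionally; retry on conflict."""
    for _ in range(max_retries):
        current = store.get(sku)
        if current.value < qty:
            return False  # invariant: stock must never go negative
        if store.put_if_version(sku, current.value - qty, current.version):
            return True
    return False  # gave up after repeated conflicts
```

A `False` result tells the caller to fail or roll back the surrounding operation. The invariant holds because the write is rejected whenever the value it was computed from is no longer current.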
Multi-region architectures present distinct tradeoffs. For strong consistency, place replicas so that the quorum balances latency against availability, such as 5 replicas across 3 regions with a majority (3 of 5) required to commit. Expect tens to hundreds of milliseconds of added write latency across continents due to consensus round trips. For eventual consistency with active-active writes in each region, define conflict resolution rules such as taking the lexicographic maximum of (timestamp, tie breaker), replicate asynchronously, and design user interfaces to tolerate temporary divergence.
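One way to express the timestamp-plus-tie-breaker rule is shown below. It is a sketch under assumptions: the `WriteRecord` fields are illustrative, the region name serves as the tie breaker, and the value itself is used as a final tie breaker so the ordering is total.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WriteRecord:
    value: str
    timestamp_ms: int  # writer's clock; clock skew is not handled in this sketch
    region: str        # assumed tie breaker when timestamps collide

    def order_key(self) -> Tuple[int, str, str]:
        # Python compares tuples lexicographically: timestamp first, then region, then value.
        return (self.timestamp_ms, self.region, self.value)


def resolve(a: WriteRecord, b: WriteRecord) -> WriteRecord:
    """Keep the write with the larger order key (last writer wins)."""
    return a if a.order_key() >= b.order_key() else b
```

Because `resolve` picks the maximum under a total order, it gives the same answer regardless of the order in which replicas see the writes, which is what lets diverged regions converge. The losing write is silently discarded, so this rule suits data where dropping a concurrent update is acceptable.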
Observability and repair are critical in production. For eventually consistent stores, run anti-entropy and repair processes that regularly checksum partitions and reconcile diverged replicas. For relational replicas, monitor replication lag via log sequence numbers or operation timestamps, and automatically fence stale replicas so they stop serving reads. Implement data Time To Live (TTL) carefully, because bulk expirations can cause thundering herds and compaction storms; stagger expirations and budget background Input/Output (I/O) to avoid disrupting foreground traffic.
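Staggering expirations can be as simple as adding jitter to each TTL at write time. The sketch below assumes a symmetric jitter of 10% by default; the function name and parameters are illustrative.

```python
import random
from typing import Optional


def jittered_ttl(base_ttl_s: int, jitter_fraction: float = 0.1,
                 rng: Optional[random.Random] = None) -> int:
    """Return base_ttl_s shifted by up to +/- jitter_fraction, in whole seconds."""
    if rng is None:
        rng = random.Random()
    spread = base_ttl_s * jitter_fraction
    return int(base_ttl_s + rng.uniform(-spread, spread))


# Items written in the same batch now expire over a window instead of all at once.
batch_ttls = [jittered_ttl(3600) for _ in range(5)]
```

With a one-hour base TTL and 10% jitter, a bulk load expires across roughly a twelve-minute window rather than in a single second, which flattens the resulting delete and compaction work.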
💡 Key Takeaways
• Relational data modeling: normalize with foreign keys and selectively denormalize via materialized views for read latency; NoSQL: denormalize around aggregates, with partition keys that co-locate related data and pre-joined documents that match query shapes
• Consistency patterns in NoSQL: use session semantics for read-your-writes; enforce critical invariants via conditional updates with compare-and-set, or route them through single-writer partitions; use centralized allocators for globally unique IDs
• Multi-region strong consistency: place quorums across regions (e.g., 5 replicas across 3 regions), adding tens to hundreds of milliseconds of write latency; eventually consistent multi-master requires conflict resolution via the lexicographic maximum of (timestamp, tie breaker) plus reconciliation tooling
• Partition key design for time series: bucket by time window plus hash suffix to spread writes while enabling range queries per bucket; keep per-partition item counts and sizes bounded to avoid compaction stalls
• Secondary indexes as separate data products: update asynchronously via change data capture with idempotent upserts; define repair and rebuild procedures; expose precise versus eventual endpoints to callers
• Observability patterns: use anti-entropy checksumming for eventually consistent stores; monitor replication lag and fence stale replicas; stagger TTL expirations and budget background I/O to avoid thundering herds and compaction storms
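The idempotent upsert mentioned for asynchronously maintained secondary indexes can be sketched as below. It assumes each change event carries a per-key sequence number that increases with every update; `ChangeEvent` and `SecondaryIndex` are hypothetical names, and the index is an in-memory dictionary for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    primary_key: str
    indexed_value: str  # the attribute being indexed, e.g. an email address
    sequence: int       # assumed per-key position in the change log


class SecondaryIndex:
    """Hypothetical index fed by a change stream that may redeliver or reorder events."""

    def __init__(self) -> None:
        self._by_pk: Dict[str, Tuple[str, int]] = {}  # pk -> (indexed_value, sequence)

    def apply(self, event: ChangeEvent) -> bool:
        """Upsert only if the event is newer than what the index already holds."""
        current = self._by_pk.get(event.primary_key)
        if current is not None and current[1] >= event.sequence:
            return False  # duplicate or stale delivery: applying it again changes nothing
        self._by_pk[event.primary_key] = (event.indexed_value, event.sequence)
        return True

    def lookup(self, indexed_value: str) -> List[str]:
        """Linear scan for clarity; a real index would keep a reverse map."""
        return sorted(pk for pk, (value, _) in self._by_pk.items() if value == indexed_value)
```

Because stale and duplicate events are ignored, the same stream can be replayed from an earlier offset to repair or rebuild the index without corrupting it.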
📌 Examples
Amazon DynamoDB time series: the partition key combines the device ID with an hourly bucket, spreading writes across partitions while enabling efficient range queries within each bucket
Google Spanner multi-region quorum: a 5-replica configuration across 3 continents with majority (3 of 5) commits, adding 50 to 150 ms of write latency for external consistency
Netflix Cassandra materialized views: asynchronously updated via change streams with idempotent upserts, exposing an eventual-consistency SLA to callers for top-N recommendations
E-commerce inventory with strong consistency: a conditional update on a version field for each stock decrement, rolling back the order if the compare-and-set fails due to concurrent depletion
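The latency cost in the multi-region quorum example above can be reasoned about with a small calculation: a leader can commit once enough followers have acknowledged to form a majority, so the commit waits for the slowest acknowledgement it still needs. The sketch below assumes the leader counts toward the majority and ignores disk and processing time; the round-trip times are hypothetical numbers, not measurements.

```python
from typing import Dict


def quorum_commit_latency_ms(follower_rtt_ms: Dict[str, float], total_replicas: int) -> float:
    """Round-trip time to the slowest follower the leader must hear from to reach a majority."""
    if len(follower_rtt_ms) != total_replicas - 1:
        raise ValueError("expected one RTT per follower")
    majority = total_replicas // 2 + 1
    acks_needed = majority - 1  # the leader's own vote is free
    if acks_needed == 0:
        return 0.0
    return sorted(follower_rtt_ms.values())[acks_needed - 1]


# Hypothetical 2-2-1 placement: one follower beside the leader, three in remote regions.
rtts = {"local": 2.0, "remote-b-1": 70.0, "remote-b-2": 72.0, "remote-c": 140.0}
latency = quorum_commit_latency_ms(rtts, total_replicas=5)  # 70.0
```

With only one follower in the leader's region, the second acknowledgement must come from a remote region, so every write pays at least one cross-region round trip. The farthest region is not on the critical path as long as the nearer replicas are healthy.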