Failure Modes in Read and Write Optimized Systems
Production systems face distinct failure modes depending on optimization bias. Understanding these edge cases and implementing mitigations separates resilient architectures from fragile ones.
Cache stampede occurs when a popular cache key expires and thousands of concurrent requests simultaneously hit the origin database. A single cache miss can generate 10,000+ origin queries in milliseconds, overwhelming the database and spiking p99 latency from 10ms to multiple seconds. Mitigations include probabilistic early expiration (refresh before the TTL elapses), request coalescing (deduplicate concurrent fetches for the same key), and stale-while-revalidate (serve expired data while refreshing in the background). Twitter actively manages this for celebrity tweets that would otherwise cause thundering herds.
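Request coalescing is the mitigation most amenable to a short sketch. The class below (names are illustrative; any origin fetch function works) lets the first miss for a key own the origin fetch while every concurrent caller for the same key waits on one shared in-flight future:

```python
import threading
from concurrent.futures import Future

class RequestCoalescer:
    """Deduplicates concurrent fetches: the first caller for a key performs
    the origin fetch; concurrent callers block on the same Future."""

    def __init__(self, fetch_fn):
        self._fetch = fetch_fn      # origin fetch, e.g. a database query
        self._lock = threading.Lock()
        self._inflight = {}         # key -> Future shared by waiting callers
        self.origin_calls = 0       # observability: how often we hit origin

    def get(self, key):
        with self._lock:
            fut = self._inflight.get(key)
            owner = fut is None
            if owner:               # first miss for this key: we own the fetch
                fut = Future()
                self._inflight[key] = fut
        if owner:
            try:
                self.origin_calls += 1
                fut.set_result(self._fetch(key))
            except Exception as exc:
                fut.set_exception(exc)
            finally:
                with self._lock:    # allow future misses to re-fetch
                    self._inflight.pop(key, None)
        return fut.result()         # waiters block here until the owner finishes
```

With 10,000 concurrent misses for one key, the origin sees a single query instead of 10,000; the trade-off is that all waiters share the owner's latency (and its failure, which is why the exception is propagated to every waiter).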
Hot partitions emerge when poor key distribution causes one shard to absorb disproportionate traffic. A celebrity user update can route 80% of writes to a single shard while others sit idle, causing p99 latency on that shard to jump from 10ms to 500ms or more. Twitter's hybrid fanout strategy addresses this: high-follower accounts trigger fanout-on-read to avoid writing one tweet to 30 million inboxes. Additional mitigations include load-aware partitioning, sharded counters for viral content, and dynamic shard splitting.
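A sharded counter is simple enough to sketch. Assuming a dict-like key-value store as a stand-in for whatever partitioned store is in use, each increment lands on a random sub-key, so no single partition absorbs every write for a viral post; reads pay a small fan-in cost by summing the shards:

```python
import random

class ShardedCounter:
    """Spreads a hot counter (e.g. a viral post's like count) across N
    sub-keys so no single partition absorbs every increment."""

    def __init__(self, store, key, shards=16):
        self.store = store      # dict-like KV store (illustrative stand-in)
        self.key = key
        self.shards = shards

    def incr(self, amount=1):
        # A random shard choice spreads write load ~evenly across partitions.
        shard_key = f"{self.key}:{random.randrange(self.shards)}"
        self.store[shard_key] = self.store.get(shard_key, 0) + amount

    def value(self):
        # Reads fan in: sum every shard's partial count.
        return sum(self.store.get(f"{self.key}:{i}", 0)
                   for i in range(self.shards))
```

The shard count trades write spread against read cost: more shards means less contention per partition but more keys to sum on read.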
Replication lag and read-after-write anomalies plague read-heavy systems with asynchronous replication. A user posts content and immediately queries for it, but the read hits a replica that is 10 seconds behind, so the post appears missing. This is particularly problematic for user-facing features. Solutions include session stickiness (route the author's reads to the primary for T seconds), monotonic reads (track the log sequence number, or LSN, and only read from replicas that have advanced past it), or exposing the inconsistency in the user experience (UX) with "your post is processing" messaging.
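The LSN-tracking approach can be sketched in a few lines. This is a minimal model, assuming each node exposes the LSN it has replayed (node and field names here are illustrative, not any particular database's API):

```python
class ReplicaRouter:
    """Read-your-writes routing sketch: after a write, remember the
    primary's log sequence number (LSN) for that session and only serve
    the session's reads from replicas that have replayed at least that far."""

    def __init__(self, primary, replicas):
        self.primary = primary      # each node: {"lsn": int, "data": {...}}
        self.replicas = replicas
        self.session_lsn = {}       # session_id -> minimum LSN it must see

    def write(self, session_id, key, value):
        self.primary["lsn"] += 1
        self.primary["data"][key] = value
        # Pin the session to this point in the log.
        self.session_lsn[session_id] = self.primary["lsn"]

    def read(self, session_id, key):
        need = self.session_lsn.get(session_id, 0)
        for replica in self.replicas:
            if replica["lsn"] >= need:      # caught up past the session's LSN
                return replica["data"].get(key)
        # No replica is fresh enough: fall back to the primary.
        return self.primary["data"].get(key)
```

The author who just posted falls back to the primary until some replica catches up, while sessions that never wrote can read any replica, keeping most read traffic off the primary.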
Fanout write storms happen when high-follower accounts trigger massive write amplification. A single tweet to 30 million followers without mitigation becomes 30 million individual writes, potentially saturating storage Input/Output Operations Per Second (IOPS), queue capacity, and network bandwidth. Systems must implement selective fanout strategies, write coalescing, batch compression, and priority queuing with backpressure to survive viral events.
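The selective-fanout decision can be sketched as follows. The threshold and data structures are illustrative, not Twitter's actual values: normal accounts push into follower inboxes at write time, while high-follower accounts write once to a pull index that readers merge in at timeline-build time:

```python
FANOUT_LIMIT = 10_000  # assumed cutoff; real systems tune this empirically

class Timeline:
    """Hybrid fanout sketch: push tweets into follower inboxes for normal
    accounts; store high-follower tweets once and merge them on read."""

    def __init__(self):
        self.inboxes = {}       # user -> list of (seq, tweet)
        self.pull_index = {}    # high-follower author -> list of (seq, tweet)
        self.followers = {}     # author -> set of follower ids
        self.seq = 0            # global ordering for the sketch

    def post(self, author, tweet):
        self.seq += 1
        fans = self.followers.get(author, set())
        if len(fans) < FANOUT_LIMIT:
            # Fanout-on-write: O(followers) inbox writes (batched in practice).
            for f in fans:
                self.inboxes.setdefault(f, []).append((self.seq, tweet))
        else:
            # Fanout-on-read: one write regardless of follower count.
            self.pull_index.setdefault(author, []).append((self.seq, tweet))

    def read(self, user, follows):
        merged = list(self.inboxes.get(user, []))
        for author in follows:  # merge high-follower tweets at read time
            merged += self.pull_index.get(author, [])
        return [t for _, t in sorted(merged, reverse=True)]  # newest first
```

The write-time saving is paid for on the read path, which is why the read result is typically cached, as in the Twitter example below.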
💡 Key Takeaways
•Cache stampede from popular-key expiration generates 10K+ concurrent origin requests in milliseconds, spiking p99 from 10ms to seconds: mitigate with request coalescing and stale-while-revalidate
•Hot partitions route 80% of traffic to one shard when celebrity or viral content hits: Twitter uses hybrid fanout, avoiding 30M writes by computing high-follower timelines on read
•Replication lag causes read-after-write anomalies when async replicas fall 10+ seconds behind: route authors to the primary or track LSNs to enforce monotonic reads
•Fanout write storms amplify a single viral event into millions of writes: require selective fanout, batch compression, and priority queues with backpressure
•Multi-master conflicts arise from clock skew and concurrent updates: last-write-wins with unsynchronized clocks silently drops updates, requiring logical clocks or conflict-free replicated data types (CRDTs)
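The multi-master takeaway is easy to demonstrate: under last-write-wins, a slow clock on one master silently discards the physically later update, while version vectors at least detect that two writes were concurrent. Timestamps, node names, and values below are invented for illustration:

```python
def lww_merge(a, b):
    # Last-write-wins on wall-clock timestamps: entries are (timestamp, value).
    return a if a[0] >= b[0] else b

def is_concurrent(vc_a, vc_b):
    # Version vectors: a conflict exists when neither update dominates.
    a_ahead = any(vc_a.get(k, 0) > vc_b.get(k, 0) for k in vc_a)
    b_ahead = any(vc_b.get(k, 0) > vc_a.get(k, 0) for k in vc_b)
    return a_ahead and b_ahead

# Master B applies the physically *later* update, but its clock runs
# 5 seconds slow, so LWW keeps the stale value written on master A.
first  = (1_700_000_000,     "bio: engineer")   # applied first, on A
second = (1_700_000_000 - 5, "bio: manager")    # applied second, on B
assert lww_merge(first, second)[1] == "bio: engineer"  # later update dropped

# Version vectors flag the same pair of writes as concurrent instead of
# silently picking one, letting the application merge or surface both.
assert is_concurrent({"A": 1}, {"B": 1})                    # true conflict
assert not is_concurrent({"A": 2, "B": 1}, {"A": 1, "B": 1})  # A dominates
```

Detection is only half the story: once a conflict is flagged, the system still needs a merge policy, which is what CRDTs provide automatically for supported data types.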
📌 Examples
Twitter celebrity tweet: 30M followers would cause a fanout storm; instead the system computes the timeline on read for high-follower accounts and caches the result
Meta TAO employs request coalescing for cache misses: multiple concurrent requests for the same key are deduplicated to a single origin fetch, preventing stampedes
LinkedIn feed consumers: during traffic spikes, consumer lag can grow to minutes or hours, requiring autoscaling and monitoring of partition-lag metrics