
Write Heavy Optimization: Buffering and Distribution

Write-heavy systems focus on smoothing bursty load, distributing writes across shards, and decoupling producers from consumers. The goal is sustained high-throughput ingestion with durability, accepting eventual consistency and increased read complexity as trade-offs.

Buffering and batching multiply throughput by reducing per-operation overhead. Batching writes in 1 to 10 millisecond windows, or in groups of 1,000 to 10,000 items, reduces system calls and filesystem syncs (fsyncs), achieving 5x to 20x throughput gains. LinkedIn's Kafka batches aggressively, handling tens of millions of messages per second at peak while maintaining durability through append-only logs. The trade-offs are slightly higher p50 latency (the batch delay) and added retry complexity: batched writes need idempotency keys to prevent duplication on retry. A minimal batching sketch follows below.

Sharding distributes write load across multiple nodes by partitioning data on stable, high-cardinality keys. Virtual shards enable incremental rebalancing without full data migration (see the routing sketch below). For hot keys such as celebrity accounts or viral content, sharded counters split a single logical counter across multiple physical keys that are merged periodically (see the counter sketch below). Without this, a single popular item can saturate one shard while the others sit idle, pushing p99 latency from 10ms to 500ms or more.

Durable queues and logs decouple write acceptance from processing. Producers append to a log and receive acknowledgment within milliseconds, while consumers process asynchronously at their own pace. This prevents slow consumers from blocking fast producers through backpressure. Systems enforce limits on in-flight messages and use per-tenant quotas to rein in runaway producers (see the bounded-log sketch below). The cost is eventual consistency and potential consumer lag during spikes, which can delay read-model updates by minutes to hours if consumers are not scaled to match.
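A minimal micro-batching sketch in Python. It assumes a caller-supplied flush_fn that performs the actual batched write (for example, one append plus one fsync per batch); the class name and defaults are illustrative, not taken from any particular library:

```python
import threading

class MicroBatchWriter:
    """Buffers writes and flushes them as one batch, either when the
    batch reaches max_items or when the oldest buffered write has
    waited max_delay_s (the 1-10 ms window described above)."""

    def __init__(self, flush_fn, max_items=1000, max_delay_s=0.005):
        self.flush_fn = flush_fn      # called with a list of items
        self.max_items = max_items
        self.max_delay_s = max_delay_s
        self.buffer = []
        self.lock = threading.Lock()
        self.timer = None

    def write(self, item):
        with self.lock:
            self.buffer.append(item)
            if len(self.buffer) >= self.max_items:
                self._flush_locked()
            elif self.timer is None:
                # Start the batch-delay clock on the first item.
                self.timer = threading.Timer(self.max_delay_s, self._flush)
                self.timer.start()

    def _flush(self):
        with self.lock:
            self._flush_locked()

    def _flush_locked(self):
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
        if self.buffer:
            batch, self.buffer = self.buffer, []
            # One syscall/fsync per batch instead of per item.
            self.flush_fn(batch)
```

A retry-safe caller would attach an idempotency key to each item before calling write(), so a re-flushed batch does not double-apply.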
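A sketch of virtual-shard routing under the assumptions above: the virtual-shard count is fixed and far larger than the node count, and the node names are placeholders. Rebalancing edits only the routing table; keys never rehash:

```python
import hashlib

NUM_VIRTUAL_SHARDS = 4096  # fixed; many more than physical nodes

def virtual_shard(key: str) -> int:
    """Map a stable, high-cardinality key (user ID, device ID) to one
    of a fixed number of virtual shards. Because the virtual-shard
    count never changes, a key's assignment is stable forever."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VIRTUAL_SHARDS

# Routing table: virtual shard -> physical node. Moving load to a new
# node means reassigning a few virtual shards here, not migrating or
# rehashing the full keyspace.
routing = {vs: f"node-{vs % 4}" for vs in range(NUM_VIRTUAL_SHARDS)}

def node_for(key: str) -> str:
    return routing[virtual_shard(key)]
```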
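A sharded-counter sketch for the hot-key case, assuming any dict-like key-value store; the key format and shard count are illustrative:

```python
import random

class ShardedCounter:
    """Splits one logical counter across num_shards physical keys so
    concurrent increments to a hot item (e.g. a viral post) spread
    across shards instead of contending on a single key."""

    def __init__(self, store, name, num_shards=16):
        self.store = store  # any dict-like key-value store
        self.keys = [f"{name}:shard:{i}" for i in range(num_shards)]

    def increment(self, amount=1):
        # Pick a random shard; concurrent writers rarely collide.
        key = random.choice(self.keys)
        self.store[key] = self.store.get(key, 0) + amount

    def value(self):
        # Reads merge all shards; often done periodically and cached.
        return sum(self.store.get(k, 0) for k in self.keys)

store = {}
likes = ShardedCounter(store, "post:123:likes")
likes.increment()
print(likes.value())  # 1
```

The read path pays for the write path's parallelism: merging shards is the "merged periodically" step in the prose above.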
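Finally, an in-process stand-in for the durable-log pattern, using Python's queue module as the log. A real system such as Kafka would persist the append before acknowledging; here the bounded queue plays the role of the in-flight limit mentioned above:

```python
import queue
import threading
import time

# Bounded buffer caps in-flight messages; a full buffer rejects
# producers instead of letting a slow consumer exhaust memory.
log = queue.Queue(maxsize=10_000)

def produce(msg, timeout_s=0.01):
    """Append and return an 'ack' within milliseconds; raises
    queue.Full when the in-flight limit is hit, so the caller can
    retry or shed load."""
    log.put(msg, timeout=timeout_s)
    return "ack"

def process(msg):
    time.sleep(0.001)  # stand-in for real downstream work

def consume():
    while True:
        msg = log.get()
        process(msg)   # consumer drains at its own pace
        log.task_done()

threading.Thread(target=consume, daemon=True).start()
```

Consumer lag in this toy is just log.qsize(); in a real deployment it is the offset gap between producer and consumer that monitoring and autoscaling act on.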
💡 Key Takeaways
- Batching writes in 1 to 10ms windows or in 1K to 10K item groups achieves 5x to 20x throughput by reducing syscalls and fsyncs, at a small p50 latency cost
- Sharding by high-cardinality keys distributes load: LinkedIn's Kafka uses partitioning to handle tens of millions of messages per second across distributed brokers
- Hot-key mitigation: sharded counters split celebrity or viral-item load across multiple keys, preventing the single-shard saturation that spikes p99 latency from 10ms to 500ms+
- Durable logs decouple producers from consumers: writes are acknowledged in milliseconds while consumers process asynchronously, preventing slow readers from blocking writers
- Consumer lag monitoring is critical: during spikes, lag can grow to minutes or hours, increasing staleness in read models and requiring autoscaling or backpressure
📌 Examples
- LinkedIn's Kafka deployment transports trillions of messages daily, with tens of millions per second at peak, batching aggressively and using partitioned topics for parallel consumption
- Twitter handles tens of thousands of tweets per second with fanout systems that queue writes and process them asynchronously, so tweet creation never blocks on delivery
- Meta uses sharded counters for viral posts to prevent single-key hotspots, distributing increments across multiple counter shards that are merged periodically