CDC Distribution and Partitioning Strategy
The Distribution Problem: You have captured 50,000 change events per second from your database log. Now you need to publish them into a distributed event stream that multiple consumers can read in parallel. But some consumers need strict ordering for a given entity (like all changes to order 12345 in sequence). How do you balance parallelism with ordering guarantees?
Partitioning by Entity Key: The solution is to partition by primary key or entity ID. Each change event is assigned to a partition based on a hash of its key, so all events for order 12345 always land in the same partition, say partition 47 out of 100 total partitions. This preserves per-key ordering while allowing horizontal scale.
With 100 partitions and 50k writes per second, each partition handles roughly 500 writes per second. Modern distributed logs like Kafka can handle 1,000 to 5,000 writes per second per partition with p99 latency under 100ms. This headroom is critical for handling traffic spikes.
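A minimal sketch of hash-based partition assignment and the resulting load math, using the 100-partition, 50k writes/sec figures above; the key format and hash function are illustrative (Kafka's default partitioner, for instance, uses murmur2 rather than MD5):

```python
import hashlib

NUM_PARTITIONS = 100  # the text cites 50-200 as typical

def partition_for(entity_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Route an event to a partition by hashing its entity key.

    A stable hash (not Python's process-randomized hash()) guarantees that
    every producer maps the same key to the same partition.
    """
    digest = hashlib.md5(entity_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# All changes to order 12345 land on the same partition, preserving per-key order.
assert partition_for("order:12345") == partition_for("order:12345")

# Per-partition load and headroom with the numbers from the text.
total_writes_per_sec = 50_000
per_partition = total_writes_per_sec / NUM_PARTITIONS   # ~500 writes/sec
capacity_low, capacity_high = 1_000, 5_000               # rough per-partition limits
print(f"{per_partition:.0f} writes/sec/partition, "
      f"{capacity_low / per_partition:.0f}x to {capacity_high / per_partition:.0f}x headroom")
```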
Retention for Replay and Recovery: The distributed log retains events for 3 to 7 days, even though the database commit log might only keep 48 hours. This longer retention serves two purposes. First, if a downstream consumer falls behind or needs to backfill, it can replay days of history without touching the OLTP database. Second, it provides time to diagnose and fix consumer bugs without data loss.
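If the distributed log is Kafka, retention and replication are per-topic settings; a sketch using the confluent-kafka Python client (the client choice and topic name are assumptions, not something the text prescribes):

```python
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Hypothetical CDC topic: 7-day retention, independent of the database's
# 48-hour commit-log window, plus the 3-way replication discussed next.
topic = NewTopic(
    "orders.cdc",
    num_partitions=100,
    replication_factor=3,
    config={
        "retention.ms": str(7 * 24 * 60 * 60 * 1000),  # 7 days
        "min.insync.replicas": "2",                     # quorum size for acknowledged writes
    },
)

for name, future in admin.create_topics([topic]).items():
    future.result()  # raises if topic creation failed
    print(f"created {name}")
```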
Durability Through Replication: Each partition is replicated across multiple availability zones, typically with 3 replicas and acknowledgment from at least 2 before a write is considered committed. This adds 10 to 30ms to publish latency but ensures no data loss on single-node failures.
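On the publish side, waiting for the replica quorum corresponds to producing with acks=all; again a sketch assuming Kafka and the confluent-kafka client, with illustrative topic and payload:

```python
from confluent_kafka import Producer

# acks=all: the leader waits for the in-sync replicas (min.insync.replicas=2 of 3
# above) before acknowledging, which is where the extra 10-30ms of publish latency
# comes from.
producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "acks": "all",
    "enable.idempotence": True,  # avoid duplicates when retrying after transient errors
})

def on_delivery(err, msg):
    if err is not None:
        raise RuntimeError(f"publish failed: {err}")

# Keyed publish: the key drives partition assignment, preserving per-key order.
producer.produce("orders.cdc", key=b"order:12345",
                 value=b'{"order_id": 12345, "status": "SHIPPED"}',
                 callback=on_delivery)
producer.flush()
```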
Partition Scaling Math: 50-200 partitions • 250-1,000 writes/sec per partition • 3-7 days retention
⚠️ Hot Key Problem: If one entity (like a viral post) receives 10% of all writes, that single partition becomes a bottleneck. Some systems split hot keys into sub-partitions or use composite keys to distribute load.
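One way to implement the composite-key approach is to salt only the detected hot keys, spreading their writes at the cost of strict per-entity ordering for those keys; a Python sketch with illustrative names:

```python
import hashlib
import random

HOT_KEYS = {"post:viral-123"}  # hot entities, e.g. found by monitoring per-key write rates
SALT_BUCKETS = 8               # sub-partitions per hot key

def routing_key(entity_key: str) -> str:
    """Append a random salt to hot keys so their writes fan out across partitions.

    Trade-off: consumers must merge the salted sub-streams, and strict
    per-entity ordering is lost for the hot key.
    """
    if entity_key in HOT_KEYS:
        return f"{entity_key}#{random.randrange(SALT_BUCKETS)}"
    return entity_key

def partition_for(key: str, num_partitions: int = 100) -> int:
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# The viral post now spreads over up to 8 partitions instead of one.
print({partition_for(routing_key("post:viral-123")) for _ in range(1_000)})
```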
Multiple Consumer Groups: Different downstream systems each run their own consumer group. The search indexing team reads from offset X, the analytics team from offset Y, and the cache invalidation service from offset Z. They progress independently. A slow consumer group does not block the others because the distributed log decouples them.
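A sketch of two consumer groups reading the same topic independently, again assuming Kafka and the confluent-kafka client; group and topic names are illustrative:

```python
from confluent_kafka import Consumer

def make_consumer(group_id: str) -> Consumer:
    # Each group.id keeps its own committed offsets, so teams progress independently.
    return Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    })

search_indexer = make_consumer("search-indexing")
cache_invalidator = make_consumer("cache-invalidation")

for consumer in (search_indexer, cache_invalidator):
    consumer.subscribe(["orders.cdc"])

# A slow loop in one group never holds back the other group's offsets.
msg = search_indexer.poll(timeout=1.0)
if msg is not None and msg.error() is None:
    print(msg.topic(), msg.partition(), msg.offset(), msg.key())
```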
This architecture is why Netflix can have dozens of teams consuming the same CDC stream for different purposes, each with their own latency requirements. Search might target 1-second freshness, while the data lake batches for 5-minute freshness to optimize storage writes. The partition-based distribution makes this flexibility possible.
💡 Key Takeaways
✓ Partitioning by primary key hash preserves per-key ordering while enabling horizontal scaling, with typical deployments using 50 to 200 partitions
✓ Each partition handles 250 to 1,000 writes per second at scale, leaving headroom for traffic spikes before hitting throughput limits
✓ Retention of 3 to 7 days in the distributed log allows downstream consumers to replay or backfill without impacting the OLTP database
✓ Replication across 3 availability zones with quorum acknowledgment adds 10 to 30ms latency but prevents data loss on failures
✓ Independent consumer groups let different teams consume the same stream at their own pace without blocking each other
📌 Examples
1. With 50k writes per second and 100 partitions, each partition receives roughly 500 writes per second, well below the 1,000 to 5,000 per second capacity limit
2. Netflix runs dozens of consumer groups on the same CDC topics, with search targeting 1-second lag and the data warehouse batching for 5-minute freshness