Stream Processing Architectures • Event Streaming Fundamentals
Partitioning, Consumer Groups, and Horizontal Scaling
The Scalability Problem:
A single consumer reading millions of events per second from a single log cannot keep up. The bottleneck is both network throughput and processing capacity. To scale, you need parallelism. But naive parallelism breaks ordering guarantees. If Event A and Event B for the same user are processed by different consumers in parallel, they might be processed out of order.
Partitioning solves this by dividing the stream into independent, ordered substreams. Consumer groups coordinate parallel consumption while preserving per-partition ordering.
How Partitioning Works:
The event log is divided into partitions (typically 100 to 1000 partitions for clusters handling millions of events per second). When a producer writes an event, it computes a hash of the event key (like <code>user_id</code> or <code>order_id</code>) and maps it to a partition. All events with the same key land in the same partition.
This preserves per-key ordering. All purchase events for user 12345 go to partition 47 (for example), and partition 47 maintains strict ordering. Events for different users go to different partitions and can be processed in parallel without coordination.
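The key-to-partition routing can be sketched in a few lines of Python. The hash function and partition count here are illustrative assumptions; Kafka's default producer, for instance, hashes key bytes with murmur2 rather than MD5.

```python
import hashlib

NUM_PARTITIONS = 200  # assumed cluster configuration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an event key to a partition via a stable hash.

    MD5 is used purely for illustration; any deterministic hash
    preserves the property that matters: same key, same partition.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Every event for the same user routes to the same partition,
# so per-key ordering is preserved.
assert partition_for("user-12345") == partition_for("user-12345")
```

Because the mapping depends only on the key, no coordination between producers is needed: any producer anywhere computes the same partition for the same key.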
Consumer Groups for Parallel Consumption:
A consumer group is a set of consumer instances that cooperate to consume partitions. The system assigns each partition to exactly one consumer instance within the group. If you have 200 partitions and 10 consumer instances, each instance reads 20 partitions.
This enables horizontal scaling. Need more throughput? Add consumer instances to the group. The system rebalances automatically, redistributing partitions across the larger pool. The maximum parallelism is the number of partitions: if you have 200 partitions, a 201st consumer sits idle because no partitions are assigned to it.
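The 200-partitions-across-10-instances arithmetic can be sketched as a simple round-robin assignment. Real brokers ship several strategies (range, round-robin, sticky); this is only a simplified illustration of the core invariant: each partition belongs to exactly one consumer in the group.

```python
def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin partitions across consumer instances.

    Each partition is owned by exactly one consumer; consumers
    beyond the partition count receive nothing.
    """
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for p in range(num_partitions):
        owner = consumers[p % len(consumers)]
        assignment[owner].append(p)
    return assignment


# 200 partitions over 10 instances: 20 partitions each.
groups = assign_partitions(200, [f"consumer-{i}" for i in range(10)])
assert all(len(parts) == 20 for parts in groups.values())

# With 201 consumers, the 201st gets no partitions and sits idle.
idle = assign_partitions(200, [f"c{i}" for i in range(201)])
assert len(idle["c200"]) == 0
```

Adding an instance just re-runs the assignment over a larger pool, which is essentially what a rebalance does.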
Offsets and Fault Tolerance:
Each consumer tracks its read position in each partition via an offset (a sequential integer). Consumers periodically commit offsets to durable storage. If a consumer crashes, another instance in the group takes over its partitions and resumes from the last committed offset.
This design yields at-least-once delivery. If a consumer processes events 1000 to 1100 but crashes before committing offset 1100, the replacement consumer restarts from offset 1000 and reprocesses those events. For exactly-once semantics, you need idempotent consumers or transactional coordination.
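A small, self-contained simulation shows why replay after a crash demands idempotent processing. All names here are hypothetical stand-ins for a real client's poll/commit API and a durable dedup store.

```python
# Event log with offsets 1000..1100, matching the crash scenario above.
log = [{"offset": i, "id": f"evt-{i}", "amount": 1} for i in range(1000, 1101)]


class Consumer:
    def __init__(self) -> None:
        self.committed = 1000  # last committed offset
        self.seen = set()      # dedup store (durable storage in production)
        self.total = 0         # the side effect: a running sum

    def process(self, event: dict) -> None:
        if event["id"] in self.seen:  # idempotence guard: skip replays
            return
        self.total += event["amount"]
        self.seen.add(event["id"])

    def run(self, upto_offset: int, commit: bool = True) -> None:
        """Read from the last committed offset; optionally crash before commit."""
        for event in log:
            if self.committed <= event["offset"] <= upto_offset:
                self.process(event)
        if commit:
            self.committed = upto_offset + 1


c = Consumer()
c.run(upto_offset=1100, commit=False)  # processes 101 events, crashes pre-commit
c.run(upto_offset=1100)                # replacement replays from offset 1000
assert c.total == 101                  # each event counted once despite replay
```

Without the idempotence guard, the second run would double-count all 101 events; the dedup set is what upgrades at-least-once delivery to effectively-once processing.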
Scaling Impact: going from 1 consumer to 10 consumers yields roughly 10x throughput.
⚠️ Common Pitfall: Choosing the wrong partition key creates hotspots. If you partition by <code>country_code</code> and 80% of traffic is from one country, 80% of events land in one partition and one consumer.
💡 Key Takeaways
✓ Partitioning divides the stream into independent, ordered substreams, enabling parallel consumption without breaking per-key ordering
✓ Events with the same key (like <code>user_id</code>) always go to the same partition via hash-based routing, preserving ordering for that key
✓ Consumer groups coordinate parallel reads: each partition is consumed by exactly one instance within the group
✓ Maximum parallelism equals the number of partitions: 200 partitions supports at most 200 consumer instances
✓ Offsets track read position per partition; committed offsets enable fault tolerance and at-least-once delivery semantics
📌 Examples
1. E-commerce stream with 200 partitions, partitioned by <code>order_id</code>: all events for order 98765 land in partition 142
2. A consumer group with 10 instances processes 1 million events per second: each instance handles 20 partitions at 100,000 events per second
3. A consumer crashes after processing events 1000 to 1100 but before committing the offset; the replacement resumes from offset 1000 and reprocesses
4. Hotspot example: partitioning by <code>product_category</code> where 70% of purchases are electronics creates one overloaded partition
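The skew in example 4 is easy to demonstrate: hashing cannot spread a single hot key, so the hottest partition carries at least that key's share of traffic no matter how many partitions exist. The keys and counts below are illustrative.

```python
from collections import Counter

# Illustrative traffic where 70% of purchases share one key,
# mirroring the product_category hotspot example.
events = ["electronics"] * 70 + ["books"] * 10 + ["toys"] * 10 + ["garden"] * 10

NUM_PARTITIONS = 8  # small illustrative count


def partition_for(key: str) -> int:
    return hash(key) % NUM_PARTITIONS  # stand-in for the producer's hash


load = Counter(partition_for(k) for k in events)
hottest = max(load.values())
# All 70 electronics events share one partition, so the hottest
# partition carries at least 70% of the traffic.
assert hottest >= 70
```

The fix is choosing a higher-cardinality key (like <code>order_id</code>) or adding a random suffix to hot keys, at the cost of per-key ordering for the split key.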