Message Queues & Streaming › Consumer Groups & Load Balancing · Easy · ⏱️ ~3 min

What Are Consumer Groups and How Do They Achieve Load Balancing?

Definition
Consumer groups are a coordination pattern where multiple worker processes cooperate to consume a partitioned stream. Each partition (an independent ordered sequence of records) is assigned to at most one consumer in the group at any time, enabling parallel processing while preserving ordering within each partition.

The Fundamental Parallelism Constraint

The maximum parallelism in a consumer group equals the partition count. If a topic has 100 partitions, at most 100 consumers can actively process work simultaneously. Each partition is assigned to exactly one consumer; no two consumers share a partition. Deploy 50 consumers and each handles approximately 2 partitions. Deploy 150 consumers and 50 sit idle, waiting for partitions that will never come. This makes partition count a critical decision at design time because it sets the ceiling for horizontal scaling.
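The arithmetic above can be sketched as a small helper (the function name `assign_counts` is hypothetical, not part of any broker API): it shows how partitions spread across consumers and how extras end up idle.

```python
def assign_counts(partitions: int, consumers: int) -> list[int]:
    """How many partitions each consumer in a group receives.

    Hypothetical helper: each partition goes to exactly one consumer,
    so at most `partitions` consumers can be active.
    """
    active = min(consumers, partitions)
    base, extra = divmod(partitions, active)
    # The first `extra` active consumers get one partition more.
    counts = [base + (1 if i < extra else 0) for i in range(active)]
    counts += [0] * (consumers - active)  # surplus consumers sit idle
    return counts

# 100 partitions, 50 consumers: each handles 2 partitions.
assert assign_counts(100, 50) == [2] * 50

# 100 partitions, 150 consumers: 100 are active, 50 idle.
counts = assign_counts(100, 150)
assert counts.count(1) == 100 and counts.count(0) == 50
```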

How Partition Assignment Works

A group coordinator (typically running on one of the brokers) manages partition assignment. When group membership changes (a consumer joins, leaves gracefully, or crashes after missing heartbeats, the periodic signals that indicate a consumer is alive), the coordinator triggers a rebalance. During a rebalance, the coordinator revokes existing partition assignments and redistributes partitions among the surviving members. This ensures that every partition has exactly one active consumer, preventing duplicate processing while maximizing parallelism. The assignment algorithm (round-robin, range, or sticky) determines how partitions map to consumers.
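A minimal sketch of the round-robin strategy and an eager rebalance (real assignors also handle stickiness, rack awareness, and incremental rebalancing, none of which is modeled here):

```python
def round_robin_assign(partitions: list[int], consumers: list[str]) -> dict:
    """Round-robin assignment: partition i goes to consumer i mod N."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = list(range(6))
before = round_robin_assign(partitions, ["c1", "c2", "c3"])
# {'c1': [0, 3], 'c2': [1, 4], 'c3': [2, 5]}

# c3 misses its heartbeats: the coordinator revokes all assignments
# and redistributes among the survivors.
after = round_robin_assign(partitions, ["c1", "c2"])
# {'c1': [0, 2, 4], 'c2': [1, 3, 5]}

# Invariant: every partition has exactly one owner before and after.
for a in (before, after):
    assert sorted(p for ps in a.values() for p in ps) == partitions
```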

Offset Tracking for Fault Tolerance

Each consumer group tracks offsets (per-partition cursors indicating the position of the next record to consume) independently. When a consumer processes records, it periodically commits its offset to durable storage. If that consumer crashes, the coordinator reassigns its partitions to surviving consumers, which resume from the last committed offset rather than from the beginning. This checkpoint-and-resume mechanism provides fault tolerance. However, the gap between processing and committing creates the possibility of duplicate processing: records processed after the last commit but before the crash will be replayed when a new consumer takes over.
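The checkpoint-and-resume mechanism can be simulated with an in-memory stand-in for the durable offset store (class name `OffsetStore` and the commit interval are illustrative assumptions):

```python
class OffsetStore:
    """In-memory stand-in for durable per-partition committed offsets."""
    def __init__(self):
        self.committed = {}
    def commit(self, partition: int, offset: int) -> None:
        self.committed[partition] = offset   # offset = next record to read
    def fetch(self, partition: int) -> int:
        return self.committed.get(partition, 0)  # default: start of partition

store = OffsetStore()

# Consumer A processes records 0-49 on partition 3,
# committing every 20 records (a hypothetical interval).
for offset in range(50):
    # ... process record at `offset` ...
    if (offset + 1) % 20 == 0:
        store.commit(3, offset + 1)

# A crashes: it processed through offset 49, but the last commit was 40.
# The consumer that takes over the partition resumes from the commit,
# so records 40-49 will be processed a second time.
resume_at = store.fetch(3)
assert resume_at == 40
```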

At-Least-Once Delivery Default

The standard pattern is process-then-commit: consume a batch of records, process them, then commit the offset. This provides at-least-once delivery: every record is processed at least once, but crashes between processing and committing result in some records being processed twice. The alternative, commit-then-process, risks data loss if a crash occurs after committing but before processing. Most systems prefer duplicate processing (which downstream systems can handle with idempotency) over data loss (which is often unrecoverable).

Key Insight: Consumer groups provide parallel processing with ordering guarantees per partition. The trade-off is that parallelism is bounded by partition count, and at-least-once delivery means designing for duplicate handling.
💡 Key Takeaways
- Maximum parallelism equals partition count; 100 partitions means at most 100 active consumers; extra consumers sit idle
- Group coordinator triggers rebalance on membership changes; revokes and redistributes partitions among surviving consumers
- Offset tracking enables fault tolerance; crashed consumer's partitions resume from last committed offset on reassignment
- At-least-once delivery: process-then-commit means duplicates on crash; commit-then-process risks data loss
📌 Interview Tips
1. Explain the parallelism constraint: if you need 200 concurrent consumers but only have 100 partitions, 100 consumers will be idle; provision partitions for future scale
2. Describe offset recovery: consumer crashes after processing records 1000-1050 but only committed 1000; new consumer replays 1000-1050 (duplicates)
3. Common follow-up: how to handle duplicates? Answer: idempotent processing, deduplication tables, or exactly-once semantics with transactional commits