
Partitioning and Ordering: Scaling Message Throughput

Message queues achieve horizontal scale through partitioning, where a logical queue is split into multiple independent shards that can be processed in parallel. The fundamental constraint is that each partition can be consumed by only one instance at a time, to preserve ordering within that partition. With 10 partitions you can run at most 10 consumer instances in parallel; an 11th sits idle. Maximum parallelism therefore equals partition count, which makes choosing the right partition count critical for throughput.

Ordering guarantees apply per partition, not globally. When you publish a message, you specify a partition key (order ID, user ID, session ID); messages with the same key always go to the same partition and are processed in order. This is why high-cardinality keys matter: if you partition by country and 80% of your traffic comes from one country, 80% of messages hit one partition, creating a hot shard that becomes your bottleneck. Google Cloud Pub/Sub ordering keys and Amazon SQS FIFO message groups implement exactly this pattern.

The math is straightforward. If each consumer instance can process 200 messages per second and your peak load is 20,000 messages per second, you need at least 100 partitions to distribute the load. But partition count is often hard to change after creation (Kafka repartitioning requires manual rebalancing; SQS FIFO message groups are implicit), so overprovisioning is common. LinkedIn's Kafka clusters often run with hundreds or thousands of partitions per topic to enable future scale and fine-grained parallelism.

The trade-off is clear: strict global ordering serializes all processing through a single consumer, destroying parallelism. Per-key ordering preserves correctness (all updates to order 12345 happen in sequence) while allowing millions of different orders to be processed concurrently across partitions. When you don't need ordering at all, you get maximum throughput: Amazon SQS Standard queues, with best-effort ordering, scale to effectively unlimited throughput.
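A small sketch makes the key-to-partition mapping and the skew problem concrete. Here `hash_to_partition` is a hypothetical stand-in for a broker's internal partitioner (Kafka, for example, hashes the key with murmur2 by default); the point is only that a stable hash sends the same key to the same partition, so key cardinality determines how evenly load spreads:

```python
# Sketch: stable key-to-partition mapping, and why low-cardinality keys
# create hot shards. hash_to_partition is a hypothetical stand-in for a
# broker's internal key partitioner, not any real client API.
import hashlib
from collections import Counter

def hash_to_partition(key: str, num_partitions: int) -> int:
    # Stable hash: the same key always lands on the same partition.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_partitions

NUM_PARTITIONS = 10

# High-cardinality key (order IDs): 10,000 distinct keys spread evenly.
orders = Counter(
    hash_to_partition(f"order-{i}", NUM_PARTITIONS) for i in range(10_000)
)

# Low-cardinality key with skew: 80% of traffic from one country all
# hashes to a single partition, creating a hot shard.
countries = ["US"] * 8_000 + ["DE"] * 1_000 + ["JP"] * 1_000
skewed = Counter(hash_to_partition(c, NUM_PARTITIONS) for c in countries)

print("order-id spread per partition:", sorted(orders.values()))
print("busiest partition under country key:", max(skewed.values()))
```

Running this shows the order-ID key distributing messages roughly evenly across all 10 partitions, while the country key concentrates at least 8,000 of the 10,000 messages on a single partition.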
💡 Key Takeaways
Maximum consumer parallelism equals partition count: with 50 partitions you can run 50 concurrent consumers; adding a 51st provides no benefit as it will sit idle waiting for partition assignment
Hot partition problem: if you partition by tenant_id and your largest tenant generates 40% of traffic, that partition becomes a bottleneck; solution is to use composite keys like tenant_id plus shard_index to split hot keys across multiple partitions
Partition sizing guideline: if each consumer handles 200 messages per second and peak load is 20,000 messages per second, provision at least 100 partitions; add 50 to 100% headroom for growth since repartitioning is operationally expensive
Ordering cost: Amazon SQS FIFO queues with strict ordering within message groups were traditionally capped at 300 transactions per second per queue (3,000 with batching), while Standard queues with no ordering guarantee scale to nearly unlimited throughput
Key cardinality matters: using user_id as partition key for a consumer app with 10 million active users provides excellent distribution; using country code with 80% traffic from one country creates extreme skew
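The composite-key fix for hot partitions can be sketched as follows. The `tenant_id:shard_index` scheme and the `HOT_TENANTS` table are illustrative assumptions, not a library API; the trade-off is that a fanned-out tenant keeps ordering only within each sub-shard, not across its whole traffic:

```python
# Sketch: splitting a hot tenant across partitions with a composite key.
# HOT_TENANTS and the tenant_id:shard_index format are assumptions for
# illustration; real systems often derive fan-out from observed load.
import random

HOT_TENANTS = {"tenant-42": 8}  # hot tenant fanned out across 8 sub-shards

def partition_key(tenant_id: str) -> str:
    fan_out = HOT_TENANTS.get(tenant_id)
    if fan_out:
        # Hot tenant: append a shard index so its traffic spreads across
        # fan_out distinct keys (and thus up to fan_out partitions).
        # Ordering now holds only within each (tenant, shard) pair.
        return f"{tenant_id}:{random.randrange(fan_out)}"
    # Normal tenants keep strict per-tenant ordering under a single key.
    return tenant_id

print(partition_key("tenant-1"))   # unchanged key
print(partition_key("tenant-42"))  # e.g. tenant-42:5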
📌 Examples
Kafka at LinkedIn: topics commonly configured with 100 to 1,000 partitions to enable fine-grained parallelism; consumer groups scale horizontally by assigning partitions across instances, with rebalancing triggered by scaling events
Google Cloud Pub/Sub ordering keys: messages with the same ordering key are delivered in order to a single subscriber instance; a high volume topic might use order_id as the key, ensuring all events for order 98765 are processed sequentially while other orders process in parallel across the subscriber fleet
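The ordering-key behavior in the Pub/Sub example can be demonstrated with a toy dispatcher (a sketch, not the Pub/Sub API): each key hashes to one partition, partitions preserve arrival order, so all events for one order stay sequential while other orders land elsewhere and can be consumed in parallel:

```python
# Sketch: per-key ordering with parallel keys. A toy dispatcher routes
# each ordering key to one partition; within a partition, messages keep
# publish order, so all events for a given order stay sequential.
from collections import defaultdict

def dispatch(events, num_partitions):
    """events: list of (order_id, event_name) tuples in publish order."""
    partitions = defaultdict(list)
    for order_id, event in events:
        # Same order_id -> same partition, every time.
        partitions[hash(order_id) % num_partitions].append((order_id, event))
    return partitions

events = [
    ("98765", "created"), ("11111", "created"),
    ("98765", "paid"),    ("11111", "cancelled"),
    ("98765", "shipped"),
]
parts = dispatch(events, 4)
# All three events for order 98765 sit in one partition, in publish order,
# while order 11111's events may live on a different partition entirely.
```

A real subscriber fleet would consume each partition with a single instance, which is exactly why the per-partition, single-consumer constraint from the main text translates into a per-key ordering guarantee.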