
Scaling Throughput with Partitions: Cloud Service Limits

Partitioning is the primary mechanism for scaling throughput while preserving per-key ordering. By spreading keys across many partitions, a system increases parallelism and aggregate throughput: 10 partitions handling 1,000 messages per second each yield 10,000 messages per second of total system throughput. Each partition maintains its own ordered log and dedicated consumer, allowing independent parallel processing.

Cloud messaging services publish per-partition throughput ceilings that directly constrain this scaling math. Amazon Kinesis shards support up to 1 MB per second or 1,000 records per second of writes (whichever limit is hit first) and 2 MB per second of reads. Azure Event Hubs provisions throughput units (TUs), each providing roughly 1 MB per second of ingress and 2 MB per second of egress. Amazon SQS FIFO queues cap at 3,000 messages per second with batching (300 messages per second without) per queue; with high-throughput FIFO you scale further by spreading traffic across multiple message groups within the queue, each group acting like a partition.

Capacity planning works backwards from consumer throughput. If a consumer can process 500 messages per second per partition while preserving serial order and you need 10,000 messages per second in total, you need at least 20 partitions; add 30 to 50 percent headroom for rebalances, spikes, and uneven key distribution. LinkedIn runs clusters with millions of partitions across thousands of brokers to handle trillions of messages daily, keeping per-partition throughput manageable (typically hundreds to low thousands of messages per second per partition).

Adding partitions is not free. Hash-based mapping (hash of key mod N) changes when N changes, remapping many keys to different partitions and disrupting consumer cache locality and downstream storage affinity. Consumer group rebalances pause consumption during reassignment, and naive implementations create stop-the-world effects. Start with fewer, hotter partitions to minimize coordination overhead, then scale out as load grows and monitoring reveals bottlenecks.
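A quick sketch of the capacity arithmetic described above (the helper name and the 40 percent default headroom are illustrative assumptions, not any library's API):

```python
import math

def partitions_needed(target_msgs_per_sec: float,
                      per_partition_msgs_per_sec: float,
                      headroom: float = 0.4) -> int:
    """Work backwards from what one consumer can drain from a single partition,
    then pad for rebalances, spikes, and uneven key distribution."""
    base = target_msgs_per_sec / per_partition_msgs_per_sec
    return math.ceil(base * (1 + headroom))

# The worked example from the text: 10,000 msg/s target, 500 msg/s per consumer.
print(partitions_needed(10_000, 500, headroom=0.0))  # 20 -- the bare minimum
print(partitions_needed(10_000, 500, headroom=0.4))  # 28 -- with 40% headroom
```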
💡 Key Takeaways
Partitions scale throughput linearly with partition count. Ten partitions at 1,000 messages per second each equal 10,000 messages per second total; doubling partitions doubles aggregate throughput while maintaining per-key order.
Cloud services have hard per-partition limits. Kinesis shards: 1 MB per second write, 2 MB per second read, 1,000 records per second write. Event Hubs TUs: roughly 1 MB per second ingress, 2 MB per second egress. SQS FIFO: 3,000 messages per second per queue with batching.
Plan capacity from the consumer processing rate. Required partitions equal target throughput divided by per-consumer throughput: if consumers handle 500 messages per second and you need 10,000 messages per second, provision 20+ partitions with headroom.
Adding partitions disrupts key mapping and triggers rebalances. Hash of key mod N changes when N changes, remapping keys to new partitions (see the sketch after these takeaways). Consumer rebalances pause processing; cooperative protocols minimize disruption but still add latency.
Start conservative and scale out based on observed load. Fewer initial partitions reduce coordination overhead; LinkedIn commonly starts new topics at 4 to 32 partitions, monitoring per partition throughput and lag before expanding.
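A minimal sketch of the remapping effect, assuming a simple hash(key) mod N assignment (real brokers use their own hash functions, for example Kafka's murmur2, but the effect is the same):

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    # Stable hash so the result is reproducible across runs; mod by partition count.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

keys = [f"order-{i}" for i in range(1_000)]
before = {k: partition_for(k, 10) for k in keys}  # 10 partitions
after = {k: partition_for(k, 12) for k in keys}   # scaled out to 12

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys changed partitions")  # typically well over 80%
```

With a uniform hash, going from 10 to 12 partitions moves roughly five of every six keys, which is exactly the cache-locality and downstream-affinity disruption described above.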
📌 Examples
Kinesis stream scaling: application needs 50 MB per second write throughput, each shard supports 1 MB per second write, provision 50+ shards with headroom (typically 65 to 75 shards for 30 to 50 percent buffer)
Azure Event Hubs: event ingestion at 80 MB per second requires 80 TUs at 1 MB per second ingress each; commonly provision 16 partitions (5 MB per second per partition) to distribute load and enable 16 concurrent consumers
Amazon SQS FIFO: order processing needs 9,000 messages per second with ordering per order ID; use 3+ message groups (each supporting 3,000 messages per second) and partition orders across groups by hash of order ID (the arithmetic for all three examples is sketched below)
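Restated as code, using the per-unit limits quoted in the examples above (the 40 percent headroom factor for the Kinesis case is an assumption within the stated 30 to 50 percent range):

```python
import math

# Kinesis: 50 MB/s write target at 1 MB/s write per shard, plus ~40% headroom.
shards = math.ceil(50 / 1.0 * 1.4)          # 70, inside the 65-75 range above

# Event Hubs: 80 MB/s ingress at roughly 1 MB/s ingress per TU.
throughput_units = math.ceil(80 / 1.0)      # 80 TUs

# SQS FIFO: 9,000 msg/s target at ~3,000 msg/s (batched) per message group.
message_groups = math.ceil(9_000 / 3_000)   # 3 groups minimum

print(shards, throughput_units, message_groups)
```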