What Are Delivery Guarantees in Message Queues?
Delivery guarantees describe what a consumer can expect when reading messages from a queue or stream. The three fundamental guarantees are at most once (may lose messages but never duplicate), at least once (may duplicate but never lose), and exactly once (no loss and no duplicates).
In production systems, true exactly once delivery over unreliable networks is impossible. What real systems provide is exactly once processing semantics at the application boundary. This means that even though a message might be delivered multiple times over the network, its side effects (database writes, API calls, charges) happen exactly once. This is achieved by combining at least once delivery with idempotent processing, where the same operation repeated multiple times produces the same result.
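The combination of at least once delivery and idempotent processing can be sketched as below. This is a minimal in-memory illustration, not a production design: the `IdempotentLedger` class, its `handle` method, and the message IDs are hypothetical names chosen for the example, and a real system would persist the processed IDs in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentLedger:
    """Applies each charge at most once, keyed by message ID (hypothetical example)."""
    balance: int = 0
    processed_ids: set = field(default_factory=set)

    def handle(self, message_id: str, amount: int) -> bool:
        """Return True if the side effect was applied, False for a duplicate."""
        if message_id in self.processed_ids:
            return False  # redelivered message: skip the side effect
        # Assumption: in a real system these two updates must commit
        # atomically, for example inside one database transaction.
        self.balance += amount
        self.processed_ids.add(message_id)
        return True


ledger = IdempotentLedger()
# At least once transport: message "m1" is delivered twice.
deliveries = [("m1", 50), ("m2", 20), ("m1", 50)]
results = [ledger.handle(mid, amt) for mid, amt in deliveries]
```

The second delivery of `m1` is recognized by its ID and skipped, so the balance reflects each charge once even though the transport delivered three messages.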
The choice between these guarantees trades throughput and operational simplicity against the burden of handling duplicates. At least once systems sustain higher throughput, which scales roughly linearly as partitions are added, because they avoid expensive coordination. Exactly once processing systems accept lower throughput (often a 10 to 40 percent reduction) and higher latency (tens of milliseconds added) in exchange for eliminating duplicate side effects at the business logic layer.
💡 Key Takeaways
•At least once delivery means messages can arrive multiple times if the consumer crashes before acknowledging, causing duplicates but never losing data
•Exactly once processing (not delivery) combines at least once transport with idempotent operations or atomic commits to ensure side effects happen once
•At least once systems scale throughput roughly linearly with partitions, while exactly once processing typically reduces throughput by 10 to 40 percent due to coordination overhead
•Amazon SQS Standard queues provide at least once delivery with nearly unlimited throughput, while FIFO queues cap at roughly 300 messages per second without batching and 3,000 with batching in exchange for exactly once processing
•Most high scale systems default to at least once and implement application level idempotency, reserving exactly once for financial and ledger flows where duplicates are costly
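The first takeaway, that a crash before acknowledgment causes a duplicate but never a loss, can be illustrated with a toy simulation. This is a sketch under stated assumptions: `AtLeastOnceQueue` and `run_consumer` are hypothetical names, the queue is in memory, and the "crash" is simulated by skipping the acknowledgment once.

```python
from collections import deque


class AtLeastOnceQueue:
    """Toy in-memory queue: a message is removed only when acknowledged."""

    def __init__(self, messages):
        self._pending = deque(messages)

    def receive(self):
        # Peek without removing; unacknowledged messages stay deliverable.
        return self._pending[0] if self._pending else None

    def ack(self, message):
        if self._pending and self._pending[0] == message:
            self._pending.popleft()


def run_consumer(queue, crash_before_ack_on):
    """Process messages, 'crashing' once before acking the given message."""
    seen = []
    crashed = False
    while (msg := queue.receive()) is not None:
        seen.append(msg)  # the side effect happens here
        if msg == crash_before_ack_on and not crashed:
            crashed = True  # simulated crash: the ack is never sent
            continue        # the restarted consumer receives the message again
        queue.ack(msg)
    return seen


seen = run_consumer(AtLeastOnceQueue(["a", "b", "c"]), crash_before_ack_on="b")
```

Message `b` is processed twice because its first acknowledgment never reached the queue, while no message is skipped. Pairing this transport with an idempotent handler is what removes the duplicate side effect.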
📌 Examples
Google Cloud Pub/Sub defaults to at least once delivery with typical in region latency of tens to low hundreds of milliseconds. If a subscriber does not acknowledge within the deadline (default 10 seconds), messages are redelivered and may arrive out of order.
Amazon Kinesis Data Streams provides at least once delivery with up to 1 MB/s or 1,000 records/s of write capacity per shard (whichever limit is reached first) and 2 MB/s of read capacity per shard. Enhanced fan out delivers around 70 ms producer to consumer latency.
LinkedIn operates Kafka at multi trillion messages per day scale across thousands of brokers, defaulting to at least once delivery and implementing exactly once processing via idempotent writes and transactional semantics in stream processors.
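The Kinesis per-shard limits quoted in the examples above lend themselves to a quick capacity estimate. The helper below is a rough sketch, not an official sizing formula: `shards_needed` is a hypothetical function, it treats 1 MB as 1,024 KB, and it assumes that consumers without enhanced fan out share the 2 MB/s read limit of each shard.

```python
import math

# Per-shard limits as quoted in the text above.
WRITE_MB_PER_S = 1.0
WRITE_RECORDS_PER_S = 1000
READ_MB_PER_S = 2.0


def shards_needed(records_per_s: float, avg_record_kb: float, consumers: int = 1) -> int:
    """Estimate the shard count from write volume, record rate, and read fan-out."""
    write_mb = records_per_s * avg_record_kb / 1024
    # Assumption: every consumer reads the full stream from the shared read limit.
    read_mb = write_mb * consumers
    return max(
        math.ceil(write_mb / WRITE_MB_PER_S),
        math.ceil(records_per_s / WRITE_RECORDS_PER_S),
        math.ceil(read_mb / READ_MB_PER_S),
        1,
    )


single_reader = shards_needed(5000, 2)             # write bandwidth is the bottleneck
three_readers = shards_needed(5000, 2, consumers=3)  # shared read bandwidth dominates
```

With 5,000 records per second at 2 KB each, the write bandwidth limit drives the estimate for a single reader, while three shared-throughput readers make the read limit the binding constraint.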