Message Queues & Streaming • Delivery Guarantees (At-least-once, Exactly-once)

Implementing At-Least-Once Delivery

Acknowledgment-Based Delivery:

At-least-once delivery is implemented by acknowledging receipt only after processing completes. If a consumer crashes before sending the acknowledgment, the broker considers the message undelivered and redelivers it to another consumer or to the same consumer after recovery. This causes duplicates but guarantees no messages are lost.
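The acknowledge-after-processing rule can be sketched with a toy in-memory broker. This is a minimal illustration, not any real broker's API: `InMemoryBroker`, `requeue_unacked`, and the `crash_before_ack_on` parameter are hypothetical names invented here to simulate a consumer that dies between processing and acknowledging.

```python
from collections import deque


class InMemoryBroker:
    """Toy broker: a message stays in flight until it is acknowledged."""

    def __init__(self, messages):
        self.pending = deque(messages)  # (msg_id, body) pairs awaiting delivery
        self.in_flight = {}             # delivered but not yet acknowledged

    def receive(self):
        if not self.pending:
            return None
        msg_id, body = self.pending.popleft()
        self.in_flight[msg_id] = body
        return msg_id, body

    def ack(self, msg_id):
        # An acknowledged message is removed for good.
        self.in_flight.pop(msg_id, None)

    def requeue_unacked(self):
        # Models the broker noticing that a consumer died without acking.
        for msg_id, body in self.in_flight.items():
            self.pending.append((msg_id, body))
        self.in_flight.clear()


def consume(broker, handler, crash_before_ack_on=None):
    """Process messages, acknowledging only after the handler returns."""
    while True:
        received = broker.receive()
        if received is None:
            return
        msg_id, body = received
        handler(body)  # the side effect happens first
        if msg_id == crash_before_ack_on:
            raise RuntimeError("consumer crashed before ack")
        broker.ack(msg_id)  # the ack comes only after processing


processed = []
broker = InMemoryBroker([("m1", "a"), ("m2", "b"), ("m3", "c")])
try:
    consume(broker, processed.append, crash_before_ack_on="m2")
except RuntimeError:
    broker.requeue_unacked()       # the unacked message becomes deliverable again
consume(broker, processed.append)  # the consumer restarts
print(processed)
```

In this run every message is processed at least once, and the message that was in flight during the simulated crash is processed twice, which is the duplicate the paragraph above describes.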

Visibility Timeout Mechanics:

The core mechanism relies on visibility timeouts or acknowledgment deadlines. In Amazon SQS, when a consumer receives a message, it becomes invisible to other consumers for a configurable visibility timeout (default 30 seconds, maximum 12 hours). If the consumer does not delete (acknowledge) the message within this window, it becomes visible again. Google Cloud Pub/Sub uses a similar ack deadline (default 10 seconds, extendable to 10 minutes).
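To make the timing concrete, here is a small simulation of a visibility timeout. It is a sketch under stated assumptions rather than the SQS or Pub/Sub client API: `VisibilityQueue` and its methods are hypothetical, and the current time is passed in explicitly as `now` (in seconds) so the behaviour is deterministic.

```python
class VisibilityQueue:
    """Toy queue with an SQS-style visibility timeout (times in seconds)."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self.messages = {}    # msg_id -> body
        self.visible_at = {}  # msg_id -> earliest time of the next delivery

    def send(self, msg_id, body):
        self.messages[msg_id] = body
        self.visible_at[msg_id] = 0.0

    def receive(self, now):
        for msg_id, body in self.messages.items():
            if self.visible_at[msg_id] <= now:
                # Hide the message from other consumers until the timeout expires.
                self.visible_at[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        # Deleting is the acknowledgment: the message is never delivered again.
        self.messages.pop(msg_id, None)
        self.visible_at.pop(msg_id, None)


queue = VisibilityQueue(visibility_timeout=30.0)
queue.send("order-1", "charge card")

first = queue.receive(now=0.0)         # delivered, then hidden until t=30
hidden = queue.receive(now=10.0)       # still invisible
redelivered = queue.receive(now=31.0)  # never deleted, so it is visible again
queue.delete("order-1")
after_delete = queue.receive(now=100.0)
print(first, hidden, redelivered, after_delete)
```

The second delivery at t=31 happens only because the first consumer never deleted the message within the 30-second window.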

⚠️ Common Pitfall: If processing often exceeds the deadline, messages are redelivered mid-processing, which multiplies duplicates under load.

Trade-off:

At-least-once systems avoid expensive coordination like two-phase commits or distributed transactions, enabling higher throughput and better availability during network partitions. Amazon SQS Standard queues scale to nearly unlimited throughput, while systems requiring stronger guarantees often cap at hundreds to low thousands of messages per second per partition.

💡 Key Takeaways

✓ Visibility timeouts control redelivery windows. Size the timeout to p99 processing time plus headroom. If p99 is 2.5 seconds and the deadline is 3 seconds, only 500 milliseconds of headroom remain, so a modest slowdown can trigger a redelivery storm.

✓ A crash after applying side effects but before acknowledging causes the side effects to be applied again on redelivery. The mitigation is to make side effects idempotent by using unique operation identifiers and conditional writes.

✓ At-least-once delivery enables higher throughput because it avoids coordination overhead such as distributed transactions or two-phase commits, making it suitable for analytics aggregation, counters, search indexing, and cache warming.

✓ Amazon SQS visibility timeout ranges from 0 seconds to 12 hours (default 30 seconds). Messages not acknowledged within this window become visible again and may be redelivered to a different consumer.

✓ Google Cloud Pub/Sub allows extending ack deadlines up to 10 minutes via streaming pull to handle long processing, but unacked messages are retained and redelivered after subscriber restarts or network partitions.
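The sizing advice in the first takeaway can be worked through numerically. The sketch below assumes a nearest-rank percentile and a headroom factor of 2.0. Both are illustrative choices made here, not values the source prescribes, and the latency samples are made up so that p99 comes out at the 2.5 seconds used in the example above.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def suggest_timeout(latencies_s, headroom_factor=2.0):
    """Suggest a timeout: p99 processing time times an assumed headroom factor."""
    return percentile(latencies_s, 99) * headroom_factor


def redelivered_fraction(latencies_s, timeout_s):
    """Fraction of messages whose processing outlasts the timeout."""
    return sum(1 for t in latencies_s if t > timeout_s) / len(latencies_s)


# Made-up processing times in seconds: mostly fast, with a slow tail.
latencies = [0.4] * 90 + [1.5] * 8 + [2.5, 4.0]

p99 = percentile(latencies, 99)
print("p99:", p99)
print("redelivered with a 3 s timeout:", redelivered_fraction(latencies, 3.0))
print("suggested timeout:", suggest_timeout(latencies))
print("redelivered with suggested timeout:",
      redelivered_fraction(latencies, suggest_timeout(latencies)))
```

With these samples, a 3-second timeout still lets the slowest message time out mid-processing, while the larger timeout covers the whole tail. The cost of a longer timeout is a longer wait before a message from a genuinely crashed consumer is retried.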

📌 Interview Tips

1. AWS Lambda with Kinesis or DynamoDB Streams uses at-least-once invocation. Amazon recommends idempotent handlers (e.g., conditional updates in DynamoDB with unique keys) to avoid duplicate side effects during retries, shard rebalancing, or function timeouts.

2. An analytics pipeline consuming clickstream events at 50,000 messages per second can tolerate duplicates because downstream aggregations are eventually consistent. Using at-least-once delivery with idempotent upserts to a data warehouse achieves high throughput without expensive coordination.

3. A cache warming service consuming product catalog updates uses at-least-once delivery. Duplicate cache writes are harmless (idempotent) and the system prioritizes availability and throughput over strict deduplication.
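The idempotent-handler pattern mentioned in these tips can be sketched as follows. This is an in-memory illustration only: `ConditionalStore`, `put_if_absent`, `IdempotentCounter`, and the `op_id` and `amount` message fields are hypothetical names, standing in for a real conditional write such as a DynamoDB put that succeeds only when the key does not already exist.

```python
class ConditionalStore:
    """Toy key-value store with a put-if-absent conditional write."""

    def __init__(self):
        self.items = {}

    def put_if_absent(self, key, value):
        if key in self.items:
            return False  # condition failed: the key was already written
        self.items[key] = value
        return True


class IdempotentCounter:
    """Applies each operation at most once, keyed by its operation ID."""

    def __init__(self):
        self.applied_ops = ConditionalStore()
        self.total = 0

    def handle(self, message):
        # The conditional write doubles as the deduplication check.
        if not self.applied_ops.put_if_absent(message["op_id"], True):
            return False  # duplicate delivery: skip the side effect
        self.total += message["amount"]
        return True


counter = IdempotentCounter()
deliveries = [
    {"op_id": "op-1", "amount": 5},
    {"op_id": "op-2", "amount": 7},
    {"op_id": "op-1", "amount": 5},  # redelivery of the first message
]
results = [counter.handle(m) for m in deliveries]
print(results, counter.total)
```

In a real system the operation marker and the side effect would need to be committed together (for example in one conditional update or one transaction). Otherwise a crash between the two steps could record the marker without applying the effect.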
