Delivery Semantics: At Most Once, At Least Once, and Exactly Once
Delivery semantics define the contract between queue and consumer when failures occur, and this choice fundamentally shapes your system architecture. At most once delivery means the queue sends each message once and never retries: if the consumer crashes or the network drops the message, it's lost forever. This gives you the lowest latency (no acknowledgment waiting, no retry logic) but risks data loss, making it suitable only for non-critical telemetry or metrics where occasional loss is acceptable.
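As a concrete illustration, here is a minimal fire-and-forget producer sketch using the confluent-kafka Python client, with an assumed broker address and topic name: setting acks to 0 tells the producer not to wait for broker confirmation, which is what makes the delivery at most once.

```python
# Fire-and-forget telemetry producer. acks=0 means the broker never
# confirms receipt, so a dropped message is silently lost; in exchange,
# the producer never blocks on acknowledgments or retries.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "acks": 0,       # at most once: no broker acknowledgment
    "retries": 0,    # and no retry logic
})

# produce() only appends to a local buffer; nothing confirms delivery.
producer.produce("metrics", key="view_count", value="1")
producer.flush()  # drain the local buffer before exiting
```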
At least once delivery, the practical default for most production systems, guarantees every message is delivered but may deliver duplicates. If a consumer processes a message but crashes before acknowledging, the queue will redeliver it. Amazon SQS Standard queues use this model: after a visibility timeout (default 30 seconds), unacknowledged messages reappear for other consumers. This requires idempotent consumer logic: processing the same order confirmation twice must not charge the customer twice. Google Cloud Pub/Sub similarly delivers messages at least once, with acknowledgment deadlines typically set to 10 seconds and automatic extensions available.
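A minimal sketch of this consume-then-acknowledge loop with boto3 and a hypothetical queue URL: the delete_message call is the acknowledgment, so a crash between processing and deletion causes SQS to redeliver the message once the visibility timeout expires, which is exactly why the handler must be idempotent.

```python
# At-least-once consumption from SQS: delete (ack) only after the
# handler succeeds; a crash before delete_message causes redelivery
# once the visibility timeout expires, so handlers must be idempotent.
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"  # hypothetical

def handle_order(body: str) -> None:
    ...  # idempotent business logic goes here

while True:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,    # long polling
        VisibilityTimeout=60,  # hide the message while we work on it
    )
    for msg in resp.get("Messages", []):
        handle_order(msg["Body"])
        # Ack only after success; a failure above means redelivery (a duplicate).
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```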
Exactly once semantics promise each message is processed precisely once, but in distributed systems this is usually "exactly once effect" rather than true exactly once delivery. It's typically built atop at least once delivery using transactional patterns: Kafka transactions combine idempotent producers (deduplicating on the write side) with consumer offsets stored transactionally alongside results. Azure Service Bus supports transactions across message receipt and state changes within a single resource. The cost is higher latency (coordinating transactions) and significantly more complexity.
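A sketch of Kafka's consume-process-produce transaction using the confluent-kafka Python client, with hypothetical topic and group names: the consumer's offsets are committed inside the same transaction as the output records, so either both become visible or neither does.

```python
# Kafka "exactly once effect": consumer offsets are committed in the
# same transaction as the produced results, so both commit or neither does.
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # assumed broker
    "group.id": "order-enricher",           # hypothetical group
    "enable.auto.commit": False,            # offsets go in the transaction
    "isolation.level": "read_committed",    # don't read aborted writes
})
producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "transactional.id": "order-enricher-1",  # enables idempotence + transactions
})

consumer.subscribe(["orders"])
producer.init_transactions()

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    producer.begin_transaction()
    try:
        producer.produce("orders-enriched", value=msg.value())
        # Commit the input offsets atomically with the output write.
        producer.send_offsets_to_transaction(
            consumer.position(consumer.assignment()),
            consumer.consumer_group_metadata(),
        )
        producer.commit_transaction()
    except Exception:
        producer.abort_transaction()  # batch becomes invisible and is reprocessed
```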
In practice, most teams implement at least once delivery with idempotent consumers using deduplication keys. You store processed message IDs in a cache or database with a time to live matching your retry window (24 to 72 hours). When a duplicate arrives, you check the dedupe store and skip reprocessing. This achieves exactly once effects at the application level without the infrastructure complexity of distributed transactions.
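A sketch of that dedupe check with redis-py, using a hypothetical key prefix: SET with nx and an expiry both records the message ID and reports whether it was already present, in a single atomic round trip.

```python
# Dedupe store: SET NX atomically claims the message ID and returns
# None if another delivery already recorded it within the TTL window.
import redis

r = redis.Redis()
RETRY_WINDOW_SECONDS = 72 * 3600  # match the queue's maximum retry window

def process_once(message_id: str, handler, payload) -> bool:
    # nx=True: set only if absent; ex=...: expire after the retry window.
    claimed = r.set(f"dedupe:{message_id}", 1, nx=True, ex=RETRY_WINDOW_SECONDS)
    if not claimed:
        return False  # duplicate delivery: skip, but still ack the message
    handler(payload)
    return True
```

Note the ordering trade-off in this sketch: claiming the key before processing means a crash mid-handler can drop the message, while claiming after processing reopens a small duplicate window; pairing the claim with the business write in one database transaction, as in the order-processing example below, avoids both.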
💡 Key Takeaways
• At most once eliminates retry overhead for lowest latency (~10 ms) but accepts data loss; Netflix uses this for non-critical view counts and real time analytics where occasional loss doesn't impact user experience
• At least once requires idempotent handlers: use database upserts with unique constraints, conditional writes checking version numbers (sketched after this list), or dedupe stores with 24 to 72 hour time to live matching your maximum retry window
• Amazon SQS visibility timeout (default 30 seconds, configurable up to 12 hours) hides messages during processing; if your consumer doesn't acknowledge within this window, the message reappears and may be processed twice
• Exactly once via transactions adds significant latency: Kafka transactions increase end to end latency from single digit milliseconds to 50 to 100 milliseconds due to coordinator round trips and log syncs
• Deduplication keys in practice: store a hash of message ID plus key business fields in Redis with time to live of 72 hours; check before processing and skip if present, achieving exactly once effects without distributed transactions
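The conditional-write variant mentioned in the second takeaway, sketched against a hypothetical accounts table with sqlite3: the UPDATE's WHERE clause only matches when the message carries a newer version, so a redelivered or out-of-order message changes nothing.

```python
# Conditional write: apply an update only if the message's version is
# newer than what's stored, so redelivered messages are harmless no-ops.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance REAL, version INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('acct-1', 100.0, 1)")

def apply_update(account_id: str, new_balance: float, msg_version: int) -> bool:
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = ? "
        "WHERE id = ? AND version < ?",
        (new_balance, msg_version, account_id, msg_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False: duplicate or stale message, nothing applied

assert apply_update("acct-1", 120.0, 2) is True   # first delivery applies
assert apply_update("acct-1", 120.0, 2) is False  # redelivery is a no-op
```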
📌 Examples
Microsoft Azure Service Bus FIFO queues with sessions: duplicate detection can be enabled with a configurable window of up to 7 days, automatically discarding messages that carry a duplicate MessageId within that window
Idempotent order processing: store order_id in a processed_orders table with a unique constraint; when processing an order, attempt to insert the order_id first; if the insert fails due to duplicate key, skip processing and acknowledge the message
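A sketch of that pattern with sqlite3 (table and column names follow the example above): the INSERT itself is the duplicate check, and running the marker insert and the business writes in one transaction keeps them atomic.

```python
# Idempotent order processing: the INSERT into processed_orders is the
# duplicate check; a unique-constraint violation means "already done".
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_orders (order_id TEXT PRIMARY KEY)")

def handle_order_message(order_id: str) -> None:
    try:
        with conn:  # one transaction: marker insert + business writes
            conn.execute("INSERT INTO processed_orders VALUES (?)", (order_id,))
            ...  # process the order (charge, fulfill) in the same transaction
    except sqlite3.IntegrityError:
        pass  # duplicate delivery: skip processing
    # Either way, acknowledge the message so the queue stops redelivering it.

handle_order_message("order-42")
handle_order_message("order-42")  # second delivery is a no-op
```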