
Streaming Idempotency: At Least Once to Effectively Once in Event Processing

Stream processing systems such as Kafka, Kinesis, and Pulsar provide at-least-once delivery guarantees, meaning events may be delivered multiple times due to consumer restarts, rebalances, orchestrator retries, or partition reassignments. Converting at-least-once delivery into effectively-once semantics requires consumer-side idempotence combined with careful offset management. The standard pattern is a control table or log that tracks processed event identifiers under a uniqueness guarantee: wrap the mark-as-processed step and the business update in a single atomic transaction, and commit consumer offsets only after the transaction commits. On retry or replay, the deduplication check detects the duplicate event and no-ops, preventing double application.

Uber relies on at-least-once ingestion from Kafka and consumer-side idempotence for millions of events per minute. A common pattern is an upsert keyed by a business identifier such as trip UUID, combined with a processed-events table that stores each event identifier under a uniqueness constraint. The consumer reads an event, starts a transaction, checks whether the event identifier exists in the processed-events table, inserts the identifier if not found, applies the business update such as updating trip status, commits the transaction, and only then commits the Kafka offset (this loop is sketched in code below). If the event identifier already exists, the transaction rolls back or no-ops, and the offset is still committed because the event was safely handled. This transaction boundary prevents advancing consumption on partial failure: if the business update succeeds but the commit fails, the next consumer will replay the event, find the identifier in the processed-events table, and skip re-application.

Partitioning streams by a stable business key such as user identifier or order identifier is critical for preserving per-key ordering and enabling convergent upserts. If all events for a given trip UUID route to the same partition and are processed in order, an upsert naturally converges to the final state regardless of replays. However, if events can arrive out of order due to multi-partition sources or reordering in the pipeline, the idempotent write must be commutative or guarded by version checks or watermark comparisons to prevent stale updates from overwriting newer state.

Amazon SQS First-In-First-Out (FIFO) queues implement idempotency with a deduplication identifier and a time-bounded window of 5 minutes. If multiple messages with the same deduplication identifier arrive within the window, only one is delivered. This is a built-in request-key deduplication mechanism that trades storage and windowing for exactly-once-in-window semantics, with published throughput limits that require batching to achieve higher effective messages per second.
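To make the consumer-side pattern concrete, here is a minimal sketch of the dedup-then-upsert loop described above, assuming the confluent-kafka Python client and SQLite as the transactional store; the topic, table, and field names are illustrative, not Uber's actual schema.

```python
# Minimal sketch of the dedup-then-upsert loop, assuming confluent-kafka
# and SQLite; topic, table, and field names are illustrative.
import json
import sqlite3

from confluent_kafka import Consumer

db = sqlite3.connect("trips.db")
db.executescript("""
    CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY);
    CREATE TABLE IF NOT EXISTS trips (trip_id TEXT PRIMARY KEY, status TEXT);
""")

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "trip-status",
    "enable.auto.commit": False,  # offsets advance only after the txn commits
})
consumer.subscribe(["trip-events"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())

    cur = db.cursor()
    # The unique constraint turns the dedup check into a no-op on replay:
    # rowcount is 0 when this event_id was already recorded.
    cur.execute("INSERT OR IGNORE INTO processed_events VALUES (?)",
                (event["event_id"],))
    if cur.rowcount == 1:  # first time this event is seen
        cur.execute(
            "INSERT INTO trips VALUES (?, ?) "
            "ON CONFLICT(trip_id) DO UPDATE SET status = excluded.status",
            (event["trip_id"], event["status"]),
        )
    db.commit()  # mark-processed and business update commit atomically

    # A crash before this line replays the event on restart; the dedup
    # check then skips re-application and only the offset commit repeats.
    consumer.commit(message=msg, asynchronous=False)
```

If the process dies between the database commit and the offset commit, the event is redelivered, the INSERT OR IGNORE matches the existing row, and only the offset commit is repeated, which is exactly the effectively-once behavior described above.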
💡 Key Takeaways
At-least-once delivery from Kafka or Kinesis requires consumer-side idempotence to prevent duplicate application; wrap event deduplication and the business update in a single atomic transaction.
Uber processes millions of events per minute using upserts keyed by trip UUID and a processed-events table with a uniqueness constraint, ensuring trips converge to the correct state despite replays.
Commit consumer offsets only after the transaction containing the deduplication check and business update commits; this prevents advancing consumption on partial failure and ensures replays are safely handled.
Partition streams by a stable business key such as user identifier to preserve per-key ordering; upserts on the key naturally converge to the final state even with replays (see the keyed-producer sketch after this list).
If events can arrive out of order across partitions, guard idempotent writes with version checks or watermark comparisons so stale updates cannot overwrite newer state (see the version-guard sketch after this list).
Amazon SQS FIFO queues provide built-in deduplication over a 5-minute window keyed on deduplication identifiers, trading throughput limits and storage for exactly-once-in-window delivery.
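For the partitioning takeaway, a producer-side sketch of keying by a stable business identifier, again assuming confluent-kafka; the topic and payload are illustrative:

```python
# Sketch: key every event by its stable business identifier so all events
# for a given trip hash to the same partition, preserving per-key order.
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce(
    "trip-events",
    key="trip_123",  # stable business key -> one partition, per-key order
    value='{"event_id": "evt_456", "trip_id": "trip_123", "status": "completed"}',
)
producer.flush()
```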
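And for the out-of-order takeaway, a minimal version-guard sketch using SQLite's conditional upsert; the orders schema and version column are assumptions for illustration, not a prescribed design:

```python
# Sketch of a version-guarded upsert: a stale event fails the WHERE
# predicate inside the upsert and changes no rows.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, "
           "version INTEGER, data TEXT)")

def apply_event(order_id: str, version: int, data: str) -> bool:
    """Apply the event only if it is newer than the stored state."""
    cur = db.execute(
        "INSERT INTO orders VALUES (?, ?, ?) "
        "ON CONFLICT(order_id) DO UPDATE "
        "SET version = excluded.version, data = excluded.data "
        "WHERE excluded.version > orders.version",
        (order_id, version, data),
    )
    db.commit()
    return cur.rowcount == 1  # 0 rows changed means the event was stale

apply_event("order_123", 2, "shipped")              # applied
assert not apply_event("order_123", 1, "created")   # stale replay, skipped
```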
📌 Examples
Uber trip event: Consumer reads event {trip_id: trip_123, event_id: evt_456, status: completed}. Transaction: INSERT INTO processed_events (evt_456) under a unique constraint, UPSERT trips SET status=completed WHERE trip_id=trip_123, COMMIT. On replay, the INSERT fails on the duplicate, the UPSERT is skipped, and the offset is committed.
Kafka consumer restart: Consumer processed events 1 to 100 and committed offset 100, then processed event 101 but crashed before committing offset 101. On restart, event 101 is replayed; the deduplication check finds it in processed_events, the business update no-ops, and offset 101 is committed.
Out-of-order events: Events {order_123, version: 2, timestamp: T2} and {order_123, version: 1, timestamp: T1} arrive out of order. Consumer checks version on upsert: UPDATE orders SET data=... WHERE order_id=order_123 AND version < 2. The stale event with version 1 fails the condition and is skipped.
SQS FIFO deduplication: Producer sends a message with MessageDeduplicationId: msg_abc. SQS stores the identifier and delivers the message. A duplicate with the same identifier within the 5-minute window is silently dropped, ensuring exactly-once delivery to the consumer within the window.
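A producer-side sketch of this deduplication behavior, assuming boto3 and an existing FIFO queue; the queue URL and identifiers are illustrative:

```python
# Sketch of SQS FIFO dedup on the producer side, assuming boto3 and an
# existing .fifo queue; the URL and identifiers are illustrative.
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"

for attempt in range(2):
    # Both sends carry the same MessageDeduplicationId: SQS delivers the
    # message once and silently drops the duplicate within the 5-minute window.
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody='{"order_id": "order_123", "status": "paid"}',
        MessageGroupId="order_123",        # preserves per-key ordering
        MessageDeduplicationId="msg_abc",  # dedup key for the 5-minute window
    )
```

Past the 5-minute window, the same MessageDeduplicationId would be accepted again as a new message, which is why this mechanism is exactly-once in window rather than exactly-once in general.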