Delivery Semantics and Resumability: Sequence Numbers, Acks, and Replay

Real-time systems must explicitly define delivery semantics because network interruptions, server restarts, and client reconnections are inevitable at scale. The lowest-latency default is at-most-once delivery: the server sends events without waiting for acknowledgment, accepting that some messages may be lost during transient disconnects. Applications that require reliability use at-least-once delivery, which combines client acknowledgments with sequence numbers and supports replay from a bounded window. The server attaches monotonically increasing sequence numbers to events within each stream (typically scoped per chat room, document, or topic), and clients track the last successfully processed sequence number. On reconnect, a client presents its last seen sequence number and the server replays the missed events from a fast replay buffer.

Replay windows are critical for balancing reliability against server resource consumption. Maintaining unbounded history per stream would exhaust memory and storage, so production systems typically keep replay buffers covering 5 to 15 minutes in memory, with optional spill to disk, expiring old entries by time or size. If a client reconnects beyond the replay window, the server instructs it to re-synchronize via a full snapshot rather than attempting to replay arbitrarily old events. This bounded approach protects server resources while still handling typical mobile network interruptions and brief application backgrounding (a server-side sketch of this scheme follows below).

Ordering guarantees must also be carefully scoped. TCP provides ordering within a single connection, but clients reconnecting to different shards or receiving events from multiple partitions may see out-of-order delivery. Production systems therefore enforce ordering per logical stream using sequence numbers: events for a given chat room or document carry sequential IDs, and clients buffer or drop out-of-order messages, reconciling via deltas or snapshots when gaps exceed a threshold. Adobe's Creative Cloud collaboration services demonstrate this pattern, pushing deltas over persistent connections with CRDT-like conflict-resolution structures and maintaining latency budgets of roughly 200 to 300 ms through opportunistic local prediction and reconciliation of late or duplicate events.
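As a concrete illustration of the bounded replay window described above, here is a minimal server-side sketch in TypeScript. All names (ReplayBuffer, StreamEvent, ResumeResult), the 10-minute retention window, and the entry cap are assumptions chosen for the example, not details taken from any specific product.

```typescript
// Sketch of a per-stream replay buffer for at-least-once delivery.
// Names, the retention window, and the size cap are illustrative assumptions.

interface StreamEvent {
  seq: number;        // monotonically increasing per stream
  payload: unknown;
  timestamp: number;  // ms since epoch, used for time-based expiry
}

type ResumeResult =
  | { kind: "replay"; events: StreamEvent[] } // gap can be filled from the buffer
  | { kind: "snapshot" };                     // client fell behind the window: full re-sync

class ReplayBuffer {
  private events: StreamEvent[] = [];
  private nextSeq = 1;

  constructor(
    private readonly maxAgeMs = 10 * 60 * 1000, // e.g. a 10-minute window
    private readonly maxEntries = 10_000,       // size cap as a second bound
  ) {}

  // Assign the next sequence number, store the event, and expire old entries.
  append(payload: unknown, now = Date.now()): StreamEvent {
    const event: StreamEvent = { seq: this.nextSeq++, payload, timestamp: now };
    this.events.push(event);
    this.expire(now);
    return event;
  }

  // Called when a client reconnects with the last sequence number it processed.
  resumeFrom(lastSeenSeq: number, now = Date.now()): ResumeResult {
    this.expire(now);
    const latestSeq = this.nextSeq - 1;
    if (lastSeenSeq >= latestSeq) {
      return { kind: "replay", events: [] }; // client is already up to date
    }
    const oldest = this.events[0];
    // The gap starts before the oldest retained event: fall back to a snapshot.
    if (oldest === undefined || lastSeenSeq < oldest.seq - 1) {
      return { kind: "snapshot" };
    }
    return { kind: "replay", events: this.events.filter(e => e.seq > lastSeenSeq) };
  }

  private expire(now: number): void {
    const cutoff = now - this.maxAgeMs;
    while (
      this.events.length > this.maxEntries ||
      (this.events.length > 0 && this.events[0].timestamp < cutoff)
    ) {
      this.events.shift();
    }
  }
}
```

On reconnect, the client would send its last processed sequence number to `resumeFrom`; a `snapshot` result tells it to fetch current state through whatever full-sync path the application already provides, rather than waiting for events the server no longer retains.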
💡 Key Takeaways
At most once delivery minimizes latency by sending without acknowledgment, accepting potential message loss during transient failures
At least once delivery requires client acks, server-assigned sequence numbers per stream, and replay capability from a bounded window, typically 5 to 15 minutes
Replay buffers are kept in memory with optional disk spill, expiring by time or size to bound server resource consumption per stream
Ordering is enforced per logical stream (room, document, topic) using monotonic sequence numbers, with clients buffering or dropping out-of-order updates (see the client-side sketch after this list)
Clients reconnecting beyond the replay window must re-sync via a full snapshot to avoid unbounded server memory and complexity
Adobe keeps collaborative editing latencies within a roughly 200 to 300 ms budget, with CRDT-based reconciliation handling late or duplicate events from reconnects
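The client-side half of the scheme can be sketched as follows: deliver events strictly in sequence order, hold out-of-order arrivals briefly, and request a re-sync when a gap does not close. The class and callback names and the pending-event threshold are illustrative assumptions, not part of any particular framework.

```typescript
// Sketch of client-side per-stream ordering: deliver in sequence order, buffer
// out-of-order arrivals, drop duplicates, and fall back to a snapshot on large gaps.
// All names and the gap threshold are illustrative assumptions.

interface IncomingEvent {
  seq: number;
  payload: unknown;
}

class OrderedStream {
  private lastDelivered = 0;
  private pending = new Map<number, IncomingEvent>(); // buffered out-of-order events

  constructor(
    private readonly deliver: (event: IncomingEvent) => void,
    private readonly requestResync: () => void, // e.g. ask the server for a snapshot
    private readonly maxPending = 256,          // gap threshold before giving up
  ) {}

  onEvent(event: IncomingEvent): void {
    if (event.seq <= this.lastDelivered) {
      return; // duplicate from a replay: drop it
    }
    this.pending.set(event.seq, event);
    this.drainInOrder();
    if (this.pending.size > this.maxPending) {
      // The gap is not closing; re-sync via snapshot instead of waiting indefinitely.
      this.pending.clear();
      this.requestResync();
    }
  }

  // Resume point to present to the server on the next reconnect handshake.
  get lastSeenSeq(): number {
    return this.lastDelivered;
  }

  private drainInOrder(): void {
    let next = this.lastDelivered + 1;
    while (this.pending.has(next)) {
      this.deliver(this.pending.get(next)!);
      this.pending.delete(next);
      this.lastDelivered = next;
      next++;
    }
  }
}
```

After a snapshot re-sync, the application would reset `lastDelivered` to the sequence number embedded in the snapshot before resuming normal event handling.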
📌 Examples
Salesforce Streaming API uses replay IDs to allow clients to resume from a specific point in the event stream, providing bounded replay windows for change data capture and platform events with typically sub-second delivery
Microsoft Fluid Framework attaches sequence numbers to document operations, supporting replay and reconciliation when editors reconnect, using CRDT/OT techniques to merge edits when perfect ordering is not maintained across partitions