Event Time Processing and Watermarking for Correctness
Event time processing uses the timestamp embedded in each event, rather than the time at which the system processes it, to decide which time window the event belongs to. This matters because network delays, clock skew, and system outages cause events to arrive late or out of order. A click generated at 2:59 PM might arrive at 3:02 PM because of mobile network latency. Processing time would incorrectly place it in the 3:00 PM to 4:00 PM window, while event time correctly assigns it to the 2:00 PM to 3:00 PM window. Both Lambda and Kappa architectures rely on event time semantics to tolerate these delays and produce correct aggregates.
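The 2:59 PM click can be traced with a minimal Python sketch. It assumes hourly tumbling windows aligned to UTC; the helper `window_start` and the sample date are illustrative and not taken from any particular framework.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=1)  # assumed window size: hourly tumbling windows
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts: datetime, size: timedelta = WINDOW) -> datetime:
    """Return the start of the tumbling window that contains ts."""
    # Floor the offset from the epoch to a whole number of windows.
    return EPOCH + ((ts - EPOCH) // size) * size


# Click generated at 2:59 PM, received by the pipeline at 3:02 PM.
event_time = datetime(2024, 1, 1, 14, 59, tzinfo=timezone.utc)
arrival_time = datetime(2024, 1, 1, 15, 2, tzinfo=timezone.utc)

event_window = window_start(event_time)         # window starting 14:00
processing_window = window_start(arrival_time)  # window starting 15:00

print("event-time window starts at:     ", event_window.isoformat())
print("processing-time window starts at:", processing_window.isoformat())
```

Keying the aggregate on `event_window` rather than `processing_window` is what keeps the click in the hour in which it actually happened.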
Watermarks bound how long you wait for late events before closing a window. A watermark at time T is the system's estimate that all events with event time earlier than T have been observed. When the watermark passes 3:00 PM, you can finalize the 2:00 PM to 3:00 PM window and emit results. The allowed lateness parameter defines how long after the watermark passes the end of a window you still accept late events for it, typically 15 minutes for ad attribution or up to 24 hours for financial corrections. Events that arrive after allowed lateness has expired are either dropped or routed to correction flows that adjust previously emitted windows.
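The interaction between the watermark and allowed lateness can be sketched in plain Python. This is a simplified illustration, not any engine's implementation: it assumes the watermark is the maximum event time seen minus a fixed out-of-orderness bound (one common heuristic), uses seconds since midnight as timestamps, and the names `TumblingCounter` and `at` are hypothetical.

```python
from collections import defaultdict

HOUR = 3600  # window size in seconds


class TumblingCounter:
    """Counts events per hourly event-time window (illustrative only)."""

    def __init__(self, max_out_of_orderness, allowed_lateness):
        self.max_out_of_orderness = max_out_of_orderness
        self.allowed_lateness = allowed_lateness
        self.max_event_time = None
        self.counts = defaultdict(int)  # window start -> event count
        self.fired = set()              # windows whose first result was emitted
        self.emitted = []               # (window_start, count, kind)
        self.dropped = []               # events that arrived too late

    @property
    def watermark(self):
        # Assumed heuristic: max event time seen minus a fixed bound.
        if self.max_event_time is None:
            return float("-inf")
        return self.max_event_time - self.max_out_of_orderness

    def on_event(self, event_time):
        start = event_time - event_time % HOUR
        # Past window end + allowed lateness: state is gone, so drop the event.
        if self.watermark >= start + HOUR + self.allowed_lateness:
            self.dropped.append(event_time)
            return
        self.counts[start] += 1
        # Late but within allowed lateness: re-emit the window as a correction.
        if start in self.fired:
            self.emitted.append((start, self.counts[start], "correction"))
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        self._on_watermark()

    def _on_watermark(self):
        # Emit the first result for windows whose end the watermark has passed.
        for start in sorted(self.counts):
            if start not in self.fired and self.watermark >= start + HOUR:
                self.fired.add(start)
                self.emitted.append((start, self.counts[start], "initial"))
        # Purge window state once allowed lateness has expired.
        expired = [s for s in self.counts
                   if self.watermark >= s + HOUR + self.allowed_lateness]
        for start in expired:
            del self.counts[start]


def at(hour, minute):
    """Seconds since midnight for a wall-clock time."""
    return hour * HOUR + minute * 60


counter = TumblingCounter(max_out_of_orderness=60, allowed_lateness=15 * 60)
for t in [at(14, 10), at(15, 2), at(14, 59), at(15, 20), at(14, 58)]:
    counter.on_event(t)

print(counter.emitted)
print(counter.dropped)
```

In this run the 2:59 PM event arrives after the 2:00 PM window has already fired but within the 15 minute allowance, so it produces a correction. The 2:58 PM event arrives after the watermark has passed 3:15 PM and is dropped.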
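Failure modes are common when event time semantics are misapplied. Daylight saving time shifts and timezone changes can move aggregates into the wrong windows when timestamps are recorded in local time rather than UTC, causing spikes or dips in metrics that trigger false alerts. Client clock drift on mobile devices can inject events with timestamps hours or days in the future. In engines that derive the watermark from the maximum event time seen, one such event pushes the watermark far ahead, so on-time events that follow are treated as late and are dropped or routed to correction flows, while the window for the bogus future time stays open until real time catches up. The mitigation is to validate event timestamps on ingestion, cap future drift at a reasonable bound such as 5 minutes, and maintain correction paths that allow recomputation when anomalies are detected.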
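A minimal sketch of ingestion-time validation follows. It assumes the ingestion service has a trusted clock and that clamping is the chosen policy; rejecting the event or sending it to a dead-letter queue are equally valid choices. The function name `validate_event_time` and the constant `MAX_FUTURE_DRIFT` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_FUTURE_DRIFT = timedelta(minutes=5)  # assumed bound on future timestamps


def validate_event_time(event_time, ingest_time,
                        max_future_drift=MAX_FUTURE_DRIFT):
    """Return (timestamp_to_use, was_adjusted), both normalized to UTC.

    Timestamps further in the future than max_future_drift are clamped to
    ingest_time + max_future_drift and flagged, so that a correction path
    can review them later.
    """
    if event_time.tzinfo is None or ingest_time.tzinfo is None:
        # Naive timestamps are ambiguous across timezones and DST shifts.
        raise ValueError("timestamps must be timezone-aware")
    event_utc = event_time.astimezone(timezone.utc)
    ingest_utc = ingest_time.astimezone(timezone.utc)
    limit = ingest_utc + max_future_drift
    if event_utc > limit:
        return limit, True
    return event_utc, False


ingest = datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc)

# A normal event, slightly in the past: passes through unchanged.
ok_ts, ok_adjusted = validate_event_time(
    datetime(2024, 1, 1, 14, 59, tzinfo=timezone.utc), ingest)

# A device whose clock is set 2 days ahead: clamped and flagged.
bad_ts, bad_adjusted = validate_event_time(
    datetime(2024, 1, 3, 15, 0, tzinfo=timezone.utc), ingest)

print(ok_ts.isoformat(), ok_adjusted)
print(bad_ts.isoformat(), bad_adjusted)
```

Normalizing to UTC at the same point also keeps daylight saving shifts out of window assignment, and the returned flag gives the correction path a way to find the events it may need to recompute.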
💡 Key Takeaways
•Event time uses the timestamp embedded in each event to assign it to the correct time window, tolerating the network delays and clock skew that cause out-of-order arrival
•Watermarks bound late arrivals by signaling that all events before time T are believed to have been observed, which allows windows to be finalized and results emitted; typical allowed lateness ranges from 15 minutes to 24 hours
•Daylight saving time shifts and timezone changes can place aggregates in the wrong windows, causing false metric spikes or dips that trigger incorrect alerts
•Client clock drift on mobile devices can inject events with timestamps hours or days in the future; where the watermark follows the maximum event time seen, this advances it prematurely, so legitimate events are treated as late and downstream results are skewed
•Mitigation requires validating event timestamps on ingestion, capping future drift at around 5 minutes, and keeping correction paths that allow affected windows to be recomputed
📌 Examples
Ad attribution: the system uses 15 minutes of allowed lateness. A mobile click generated at 2:59 PM arrives at 3:02 PM because of network lag. It is still assigned to the 2:00 PM to 3:00 PM window, because it arrives before allowed lateness for that window expires when the watermark reaches 3:15 PM.
Financial corrections: the system uses 24 hours of allowed lateness. A transaction event delayed by an overnight batch upload still updates the previous day's aggregate, because it arrives within the correction window.
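Clock drift failure: a mobile device with its clock set 2 days ahead sends an event with a future timestamp. With a watermark derived from the maximum event time seen, the watermark jumps ahead by roughly 2 days, current windows close prematurely, and on-time events are treated as late until the bad event is filtered or capped at ingestion.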