Freshness vs Point in Time Correctness
Real Time Streaming Freshness
Real time streaming delivers maximum freshness, with feature staleness of seconds to minutes, so models can react almost immediately to changes in user behavior. Meta Ads ranking achieves sub-second freshness for critical CTR counters, allowing the system to downweight ads experiencing sudden engagement drops within seconds. However, streaming systems face inherent correctness challenges from late-arriving events, out-of-order processing, and clock skew across distributed producers.
Point in Time Correctness
Point in time correctness is the gold standard for training data because it prevents label leakage. When building a training example with label timestamp T, feature joins must include only data whose event time is less than or equal to T and falls within the feature's defined lookback window. Without this guarantee, a churn prediction model whose training features accidentally include information from after the churn event will show an optimistic offline AUC that collapses in production. Airbnb's Zipline enforces point in time joins through automated snapshot versioning.
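The join rule above can be sketched in a few lines of Python. This is a minimal illustration, not Zipline's implementation: the names FeatureValue, point_in_time_lookup, and max_age are hypothetical, and timestamps are assumed to be integer epoch seconds.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureValue:
    event_time: int  # when the value became true (epoch seconds, assumed)
    value: float


def point_in_time_lookup(history, label_time, max_age):
    """Return the latest value with event_time <= label_time, or None.

    `history` must be sorted by event_time ascending. Values older than
    `max_age` seconds relative to label_time fall outside the window.
    """
    times = [fv.event_time for fv in history]
    idx = bisect_right(times, label_time) - 1  # last entry at or before T
    if idx < 0:
        return None  # no feature value existed yet at label time
    candidate = history[idx]
    if label_time - candidate.event_time > max_age:
        return None  # value is too stale for the defined window
    return candidate.value


history = [
    FeatureValue(event_time=100, value=3.0),
    FeatureValue(event_time=200, value=5.0),
    FeatureValue(event_time=300, value=0.0),  # after the label: must not leak
]
pit_value = point_in_time_lookup(history, label_time=250, max_age=3600)
print(pit_value)  # 5.0
```

The value recorded at event time 300 is ignored for a label at T = 250, which is exactly the leakage the join is meant to prevent. A naive "latest value" join would have returned 0.0 instead.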
The Tension
Streaming achieves freshness through continuous incremental updates that are eventually consistent, while point in time correctness demands reproducible snapshots with strict event time semantics. A streaming counter for "purchases in last 24 hours" might be missing late events that arrive hours later because of mobile offline sync or time zone issues. Training on this incomplete stream creates a distribution shift relative to the complete offline batch computation of the same feature.
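A small sketch makes the gap concrete. It assumes each event carries both an event time and an arrival time; the Event class, the two counting functions, and the sample timestamps are all hypothetical and chosen only to show how a single late arrival makes the two paths disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_time: int    # when the purchase happened (seconds)
    arrival_time: int  # when the pipeline received it (seconds)


WINDOW = 24 * 3600  # "purchases in last 24 hours"


def streaming_count(events, as_of):
    """Count in-window purchases that had already arrived by `as_of`."""
    return sum(
        1
        for e in events
        if as_of - WINDOW < e.event_time <= as_of and e.arrival_time <= as_of
    )


def batch_count(events, as_of):
    """Count in-window purchases by event time, ignoring arrival time."""
    return sum(1 for e in events if as_of - WINDOW < e.event_time <= as_of)


AS_OF = 100_000
events = [
    Event(event_time=50_000, arrival_time=50_010),
    Event(event_time=90_000, arrival_time=90_005),
    # Purchase made offline on a phone, synced 5 hours later.
    Event(event_time=95_000, arrival_time=95_000 + 5 * 3600),
]
print(streaming_count(events, AS_OF))  # 2
print(batch_count(events, AS_OF))      # 3
```

The streaming path serves 2 at query time, while a later batch recomputation for the same instant yields 3. A model trained on one value and served the other sees a shifted feature distribution.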
Dual Path Architecture
Production systems reconcile this tension by running both paths. Online streaming features optimize for freshness and accept eventual consistency, using watermarks and allowed lateness windows (typically 1 to 6 hours). Offline batch features reprocess the same event streams with longer grace periods (24 to 72 hours) to achieve completeness and point in time correctness. Periodic reconciliation jobs compare online state against the offline recomputation and alert when divergence exceeds configured thresholds.
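A reconciliation check of this kind might look like the sketch below. It is an illustration under stated assumptions: feature values are held in plain dictionaries keyed by entity ID, divergence is measured as relative difference, and the reconcile function and its 5% default tolerance are hypothetical rather than taken from any particular system.

```python
def reconcile(online, offline, rel_tolerance=0.05):
    """Return {key: relative_difference} for keys that diverge too far.

    `offline` is treated as the reference because it was recomputed with a
    longer grace period. Keys missing online are compared against 0.0.
    """
    diverged = {}
    for key, offline_value in offline.items():
        online_value = online.get(key, 0.0)
        # Floor the denominator at 1.0 so tiny counts do not explode the ratio.
        denom = max(abs(offline_value), 1.0)
        rel_diff = abs(online_value - offline_value) / denom
        if rel_diff > rel_tolerance:
            diverged[key] = rel_diff
    return diverged


online_state = {"user_1": 98.0, "user_2": 40.0}
offline_state = {"user_1": 100.0, "user_2": 50.0, "user_3": 2.0}

alerts = reconcile(online_state, offline_state)
print(sorted(alerts))  # ['user_2', 'user_3']
```

Here user_1 differs by 2% and passes, user_2 differs by 20%, and user_3 is missing from the online store entirely, so both are flagged. In practice the tolerance would be tuned per feature, since counters with many late events diverge more before the grace period closes.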