What is Point-in-Time (PIT) Correctness in ML Systems?
Point-in-time (PIT) correctness ensures that both training datasets and online predictions use only information that would have been available at the exact moment of prediction. Think of it as time-travel reads over your feature data: for any entity at time t, the system must reconstruct the last known value as of t, not what you know now. This eliminates future leakage, where information from after the prediction timestamp contaminates your model.
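To make the "as of t" read concrete, here is a minimal sketch, assuming the feature history for one entity is just an in-memory list of (event_time, value) pairs sorted by event time; a real feature store would index this per entity ID:

```python
from bisect import bisect_right

def value_as_of(history, t):
    """Return the last known feature value as of time t (a time-travel read)."""
    # history: list of (event_time, value) pairs sorted ascending by event_time.
    times = [event_time for event_time, _ in history]
    # bisect_right finds the insertion point past all event_time <= t,
    # so the entry just before it is the latest value visible at t.
    i = bisect_right(times, t)
    return history[i - 1][1] if i > 0 else None  # None: nothing known yet at t

# A read "as of" t=5 returns the value written at t=3,
# not the later value written at t=7.
history = [(1, 10.0), (3, 12.5), (7, 9.0)]
assert value_as_of(history, 5) == 12.5
assert value_as_of(history, 0) is None
```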
The core requirement is strict separation of event time (when the fact actually happened) from processing time (when your system saw it). A fraud detection feature computed at 3pm but reflecting transaction data from 2pm must be timestamped at 2pm, not 3pm. Without this distinction, late-arriving data can leak future information into past training examples.
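A sketch of why both timestamps matter: the hypothetical record below carries event_time and processing_time, and the training-time filter uses both, so a late-arriving record never appears in a row the online system could not have seen (names here are illustrative):

```python
from dataclasses import dataclass

@dataclass
class FeatureRecord:
    entity_id: str
    value: float
    event_time: float       # when the underlying fact occurred (e.g. 2pm = 14.0)
    processing_time: float  # when the pipeline observed and wrote it (e.g. 3pm = 15.0)

def visible_at(records, t):
    """Records a prediction made at time t was actually allowed to use."""
    # Filtering on event_time alone would let a late-arriving record
    # (event_time <= t but processing_time > t) leak into training rows
    # the online system could never have seen at t.
    return [r for r in records if r.event_time <= t and r.processing_time <= t]

# The 2pm transaction ingested at 3pm must not appear in a training
# row built for a 2:30pm prediction.
records = [
    FeatureRecord("user_1", 42.0, event_time=14.0, processing_time=14.1),
    FeatureRecord("user_1", 99.0, event_time=14.0, processing_time=15.0),
]
assert [r.value for r in visible_at(records, 14.5)] == [42.0]
```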
PIT correctness becomes critical when labels trail features (fraud confirmed days later, ad clicks arriving hours after the impression), when features are time-window aggregates (user activity over the past 7 days), or when data arrives out of order in streaming systems. Uber's Michelangelo processes 100 million to 1 billion training examples using PIT joins to ensure temporal consistency. Netflix uses snapshot-based time travel on petabyte-scale tables to rebuild exact historical training datasets months later for audits and model rollbacks.
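For window aggregates the same two-timestamp filter applies; a minimal sketch of a PIT-correct 7-day activity count, assuming events are (event_time, processing_time) pairs in seconds for one user:

```python
def activity_7d(events, t, window=7 * 24 * 3600):
    """PIT-correct 7-day activity count for a prediction made at time t."""
    # An event counts only if it occurred inside the window AND had already
    # been observed by t, so out-of-order arrivals cannot retroactively
    # inflate historical training rows.
    return sum(1 for event_time, processing_time in events
               if t - window <= event_time <= t and processing_time <= t)

day = 24 * 3600
events = [(2 * day, 2 * day), (6 * day, 9 * day)]  # second event arrives 3 days late
assert activity_7d(events, 8 * day) == 1  # the late event was not yet visible at t
```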
The same principles as database point-in-time recovery apply: maintain immutable, versioned histories with event timestamps, then reconstruct state at any moment from a base snapshot plus an append-only change log. This underpins reproducibility, letting you recreate the exact dataset that trained a deployed model despite ongoing pipeline evolution.
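A minimal sketch of that snapshot-plus-changelog reconstruction, with both structures as hypothetical in-memory stand-ins for versioned tables:

```python
def state_as_of(snapshot, changelog, t):
    """Rebuild all feature values as of time t, PITR-style."""
    # snapshot: {entity_id: value} as of the base snapshot time.
    # changelog: (event_time, entity_id, value) entries sorted by
    # event_time, covering everything after the snapshot.
    state = dict(snapshot)
    for event_time, entity_id, value in changelog:
        if event_time > t:
            break  # log is sorted; everything beyond t is "the future"
        state[entity_id] = value
    return state

# Replaying the log up to t=12 applies the first change but not the second.
snapshot = {"user_1": 10.0, "user_2": 3.0}
changelog = [(11, "user_1", 12.0), (15, "user_2", 4.0)]
assert state_as_of(snapshot, changelog, 12) == {"user_1": 12.0, "user_2": 3.0}
```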
💡 Key Takeaways
• Event time (when the fact occurred) must be separated from processing time (when the system observed it) to prevent late-arriving data from contaminating past training examples
• Requires immutable, versioned feature histories in which each value is keyed by entity ID and event timestamp, enabling reconstruction of state at any historical moment
• Temporal as-of joins select the latest feature value with timestamp less than or equal to the target time, costing 1.5 to 4 times more compute than naive latest-value joins at scales of 100 million rows and up (see the pandas sketch after this list)
• Production systems like Uber's Michelangelo enforce PIT joins for training sets of 100 million to 1 billion examples while keeping p99 online serving latency in the 10 to 20 millisecond range
• Critical for any supervised learning with time-dependent features, delayed labels (fraud, recommendations, ads), or streaming data with out-of-order arrival
• Enables reproducibility by persisting exact dataset manifests with feature versions and snapshot IDs, allowing byte-for-byte rebuilds months later for audits and rollbacks
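The as-of join in the third takeaway can be sketched with pandas.merge_asof; table and column names here are illustrative, not from any particular feature store:

```python
import pandas as pd

features = pd.DataFrame({
    "entity_id": ["u1", "u1", "u2"],
    "event_time": pd.to_datetime(["2024-01-01", "2024-01-05", "2024-01-03"]),
    "spend_7d": [20.0, 35.0, 8.0],
}).sort_values("event_time")  # merge_asof requires sorted join keys

labels = pd.DataFrame({
    "entity_id": ["u1", "u2"],
    "label_time": pd.to_datetime(["2024-01-04", "2024-01-02"]),
    "is_fraud": [0, 1],
}).sort_values("label_time")

# Each label row picks up the latest feature value with
# event_time <= label_time for the same entity, never a later one.
training = pd.merge_asof(
    labels, features,
    left_on="label_time", right_on="event_time",
    by="entity_id", direction="backward",
)
# u1 gets the Jan 1 value (Jan 5 is in its future); u2 gets NaN
# because its first feature value arrives only after the label time.
print(training[["entity_id", "label_time", "spend_7d"]])
```

direction="backward" is what makes the join point-in-time correct: each label row sees only feature values at or before its own timestamp.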
📌 Examples
Airbnb's Zipline requires an explicit event timestamp and a separate ingestion timestamp for all features, using as-of joins across petabyte-scale tables to build training datasets with thousands of features across hundreds of models
Netflix uses Iceberg-like table formats with snapshotting to rebuild multi-hundred-million-row training datasets in hours while preserving temporal consistency, supporting time-travel queries over months of history
Meta's unified feature store provides time-travel semantics, handling tens of billions of feature reads per day at single-digit to low-tens-of-milliseconds p99 latency, versioning feature values by event time for safe backfills