Feature Engineering & Feature Stores: Point-in-Time Correctness

What Is Point-in-Time (PIT) Correctness in ML Systems?

Definition
Point-in-time (PIT) correctness ensures that both training datasets and online predictions use only information that would have been available at the exact moment of prediction. It enables time-travel reads over your feature data: for any entity at time t, the system reconstructs the last known value as of t, not what you know now.
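A time-travel read like this can be sketched as a lookup over a sorted, per-entity value history (the `history` data and `value_as_of` helper are illustrative assumptions, not a specific feature-store API):

```python
from bisect import bisect_right

# Hypothetical per-entity feature history: lists of (event_time, value),
# sorted by event_time. Timestamps are illustrative epoch seconds.
history = {
    "user_42": [(100, 0.10), (200, 0.35), (300, 0.80)],
}

def value_as_of(entity_id, t):
    """Return the last known feature value as of time t (a time-travel read)."""
    rows = history.get(entity_id, [])
    # Index of the rightmost record with event_time <= t.
    i = bisect_right(rows, (t, float("inf"))) - 1
    return rows[i][1] if i >= 0 else None

print(value_as_of("user_42", 250))  # 0.35 -- the value written at t=200, not t=300
```

The key property: a query at t=250 returns the t=200 value, never the t=300 one, no matter when the query itself runs.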

Why It Matters

PIT correctness eliminates future leakage, where information from after the prediction timestamp contaminates your model. This is one of the most insidious bugs in ML systems because offline metrics look great while production performance mysteriously degrades.

Event Time vs Processing Time

The core requirement is strict separation of event time (when the fact actually happened) from processing time (when your system saw it). A fraud detection feature computed at 3pm but reflecting transaction data from 2pm must be timestamped at 2pm, not 3pm. Without this distinction, late-arriving data can leak future information into past training examples.
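The separation can be made concrete with a minimal sketch. The record fields and helper names below are assumptions for illustration, not a real feature-store schema:

```python
from dataclasses import dataclass

# Illustrative record carrying both clocks.
@dataclass(frozen=True)
class FeatureRecord:
    entity_id: str
    event_time: int       # when the fact happened (the 2pm transaction)
    processing_time: int  # when the system observed it (the 3pm batch job)
    value: float

def visible_as_of(records, t):
    """PIT visibility: only facts whose *event* time is <= t count."""
    return [r for r in records if r.event_time <= t]

# A late-arriving record: materialized at 3pm but describing 2pm.
records = [
    FeatureRecord("card_7", event_time=14 * 3600,
                  processing_time=15 * 3600, value=1.0),
]

# A training example timestamped 2:30pm correctly sees the 2pm fact,
# even though the pipeline only observed it at 3pm.
assert len(visible_as_of(records, 14 * 3600 + 1800)) == 1
```

Filtering on `processing_time` instead would stamp the 2pm fact at 3pm, shifting facts across the prediction boundary and breaking temporal consistency between offline and online views.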

When PIT Becomes Critical

PIT correctness is essential when labels trail features (fraud confirmed days later, ad clicks happening hours after the impression), when features are time-window aggregates (user activity over the past 7 days), or when data arrives out of order in streaming systems. Uber processes 100 million to 1 billion training examples using PIT joins to ensure temporal consistency.
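The delayed-label case maps directly onto an as-of join: for each labeled prediction time, pick the latest feature row at or before it. A minimal sketch with `pandas.merge_asof` (all table contents are made up for illustration):

```python
import pandas as pd

# Labels arrive later than the features they must be scored against.
labels = pd.DataFrame({
    "entity_id": ["u1", "u1"],
    "prediction_time": pd.to_datetime(["2024-01-10", "2024-01-20"]),
    "is_fraud": [0, 1],
})

# Versioned feature history keyed by entity and event time.
features = pd.DataFrame({
    "entity_id": ["u1", "u1", "u1"],
    "event_time": pd.to_datetime(["2024-01-05", "2024-01-15", "2024-01-25"]),
    "txn_7d": [3, 9, 40],
})

# As-of join: for each label row, take the latest feature row with
# event_time <= prediction_time -- never a future value.
train = pd.merge_asof(
    labels.sort_values("prediction_time"),
    features.sort_values("event_time"),
    left_on="prediction_time",
    right_on="event_time",
    by="entity_id",
)
print(train["txn_7d"].tolist())  # [3, 9] -- the 2024-01-25 value never leaks
```

A naive "join latest value" would attach `txn_7d = 40` to both labels, silently training the model on information from after each prediction.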

Implementation Principle

The same principles as database point-in-time recovery apply: maintain immutable, versioned histories with event timestamps, then reconstruct state at any moment using base snapshots plus an append-only change log. This underpins reproducibility, letting you recreate the exact dataset that trained a deployed model despite ongoing pipeline evolution.
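The snapshot-plus-changelog reconstruction can be sketched in a few lines (all data and names here are illustrative assumptions):

```python
# Base state materialized as of t=100.
snapshot = {"u1": 0.2, "u2": 0.5}

# Append-only change log of (event_time, entity_id, new_value), time-ordered.
changelog = [
    (150, "u1", 0.4),
    (180, "u2", 0.9),
    (220, "u1", 0.7),
]

def state_as_of(t):
    """Replay the immutable log on top of the snapshot to rebuild state at t."""
    state = dict(snapshot)
    for event_time, entity, value in changelog:
        if event_time > t:
            break  # log is time-ordered; nothing later can apply
        state[entity] = value
    return state

print(state_as_of(200))  # {'u1': 0.4, 'u2': 0.9}
```

Because both the snapshot and the log are immutable, replaying them at the same t months later yields the same state, which is exactly the reproducibility guarantee described above.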

💡 Key Takeaways
- Event time (when the fact occurred) must be separated from processing time (when the system observed it) to prevent late data from contaminating past training examples
- Requires immutable, versioned feature histories where each value is keyed by entity ID and event timestamp, enabling reconstruction of state at any historical moment
- Temporal as-of joins select the latest feature value with a timestamp less than or equal to the target time, costing 1.5 to 4 times more compute than naive latest-value joins at 100-million-plus-row scale
- Production systems like Uber's Michelangelo enforce PIT joins for 100 million to 1 billion example training sets while maintaining p99 online latency under 10 to 20 milliseconds
- Critical for any supervised learning with time-dependent features, delayed labels (fraud, recommendations, ads), or streaming data with out-of-order arrival
- Enables reproducibility by persisting exact dataset manifests with feature versions and snapshot IDs, allowing byte-for-byte rebuilds months later for audits and rollbacks
📌 Interview Tips
1. Airbnb's Zipline requires an explicit event timestamp and a separate ingestion timestamp for all features, using as-of joins across petabyte-scale tables to build training datasets with thousands of features across hundreds of models
2. Netflix uses Iceberg-like table formats with snapshotting to rebuild multi-hundred-million-row training datasets in hours while preserving temporal consistency, supporting time-travel queries over months of history
3. Meta's unified feature store provides time-travel semantics handling tens of billions of feature reads per day at single-digit to low-tens-of-milliseconds p99 latency, versioning feature values by event time for safe backfills