Feature Versioning and Time Travel for Reproducible Rollback

Why Version Features
Feature versioning treats feature definitions, transformations, and schemas as immutable versioned artifacts alongside model weights. Every feature has a version identifier linking its computation logic, dependencies, and data sources. When a model trains on feature set v5, that model's manifest records v5 explicitly. This prevents the silent failures that occur when feature computation changes but models are not retrained.
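As a minimal sketch of the manifest idea above, the following Python records the feature set version a model was trained on and refuses to serve against any other version. The class names, fields, and the `rider_features` feature set are hypothetical illustrations, not the schema of any particular feature store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSetVersion:
    """Immutable description of one feature set version (hypothetical schema)."""
    name: str
    version: str          # e.g. "v5"
    transform_hash: str   # hash of the computation logic
    sources: tuple        # upstream data sources


@dataclass(frozen=True)
class ModelManifest:
    """Records exactly which feature set version a model was trained on."""
    model_id: str
    feature_set: str
    feature_version: str


def check_compatible(manifest: ModelManifest, serving: FeatureSetVersion) -> None:
    """Raise instead of silently serving features the model was not trained on."""
    expected = (manifest.feature_set, manifest.feature_version)
    actual = (serving.name, serving.version)
    if expected != actual:
        raise ValueError(
            f"model {manifest.model_id} expects {expected}, but serving provides {actual}"
        )


v5 = FeatureSetVersion("rider_features", "v5", "abc123", ("trips",))
v6 = FeatureSetVersion("rider_features", "v6", "def456", ("trips", "ratings"))
manifest = ModelManifest("eta-model", "rider_features", "v5")

check_compatible(manifest, v5)  # passes: versions match
```

Calling `check_compatible(manifest, v6)` raises a `ValueError`, turning a silent training/serving mismatch into an explicit failure.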
Time Travel Capability
Time travel allows point-in-time consistent reads: querying the feature store for user 12345 as of timestamp T returns exactly the feature values that existed at T, regardless of later updates or backfills. This is hard to implement at scale because feature stores typically overwrite values on each update. Implementations use either event sourcing (storing every value change in an append-only log) or snapshot isolation (periodic consistent snapshots plus delta storage).
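The event sourcing approach can be sketched as an append-only log per entity and feature, with as-of reads done by binary search. This is an in-memory illustration only: the class, the `trips_7d` feature name, and the choice to key every event by its commit timestamp (the moment the value became visible in the store) are assumptions made for the example.

```python
import bisect
from collections import defaultdict


class EventLogFeatureStore:
    """Keeps every value change instead of overwriting (in-memory sketch)."""

    def __init__(self):
        # (entity_id, feature) -> parallel lists kept sorted by commit timestamp
        self._timestamps = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, entity_id, feature, commit_ts, value):
        """Append a value change; older entries are never modified."""
        key = (entity_id, feature)
        idx = bisect.bisect_right(self._timestamps[key], commit_ts)
        self._timestamps[key].insert(idx, commit_ts)
        self._values[key].insert(idx, value)

    def read_as_of(self, entity_id, feature, ts):
        """Return the latest value committed at or before ts, or None."""
        key = (entity_id, feature)
        idx = bisect.bisect_right(self._timestamps.get(key, []), ts)
        return self._values[key][idx - 1] if idx else None


store = EventLogFeatureStore()
store.write(12345, "trips_7d", 100, 3)
store.write(12345, "trips_7d", 200, 5)
store.write(12345, "trips_7d", 300, 9)  # a later update or backfill

value_at_150 = store.read_as_of(12345, "trips_7d", 150)  # still sees the value from ts=100
```

Because later writes carry later commit timestamps, they cannot change what a read as of an earlier timestamp returns.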
Critical Rollback Scenarios
Time travel enables two critical rollback scenarios. First, forensic debugging: when a prediction anomaly occurs, engineers can reconstruct the exact inputs the model received by time traveling to the request timestamp and comparing those inputs against expected distributions. Second, safe model reversion: rolling back to a 30-day-old model requires features computed with 30-day-old logic. If the feature store retains 90 days of versioned definitions and supports on-demand backfill, the old model can serve current requests with compatible features.
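The safe reversion scenario can be illustrated with a small registry of versioned feature definitions, where an on-demand backfill recomputes features from current raw data using the logic the old model expects. Everything here is a hypothetical sketch: the registry, the `v4` and `v5` transformations, and the feature names are invented for the example, and the input list is assumed to be non-empty.

```python
from typing import Callable, Dict, List

# version -> transformation; a stand-in for retained, versioned definitions
FEATURE_DEFINITIONS: Dict[str, Callable[[List[float]], Dict[str, float]]] = {}


def register(version: str):
    """Decorator that stores a transformation under its version identifier."""
    def decorator(fn):
        FEATURE_DEFINITIONS[version] = fn
        return fn
    return decorator


@register("v4")
def features_v4(trip_minutes: List[float]) -> Dict[str, float]:
    # Older logic: mean trip duration
    return {"avg_trip_minutes": sum(trip_minutes) / len(trip_minutes)}


@register("v5")
def features_v5(trip_minutes: List[float]) -> Dict[str, float]:
    # Newer logic: (upper) median trip duration
    ordered = sorted(trip_minutes)
    return {"median_trip_minutes": ordered[len(ordered) // 2]}


def backfill_for_model(required_version: str, raw_events: List[float]) -> Dict[str, float]:
    """Recompute features from current raw data using the requested version's logic."""
    try:
        transform = FEATURE_DEFINITIONS[required_version]
    except KeyError:
        raise LookupError(
            f"feature definition {required_version} is outside the retention window"
        ) from None
    return transform(raw_events)


# The rolled-back model was trained on v4, so current data is transformed with v4 logic.
old_model_features = backfill_for_model("v4", [10.0, 20.0, 30.0])
```

If the requested version has already been purged, the lookup fails loudly, which is the practical meaning of a rollback window being limited by retention.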
Storage and Retention Trade-offs
Netflix and LinkedIn maintain feature lineage linking each model version to specific feature schema versions. Dataset snapshots are also timestamped and retained, so historical models can be retrained identically. The storage cost is nontrivial: at Uber's scale (petabytes of feature data), retaining 90 days of point-in-time snapshots requires data lifecycle policies and compaction strategies. Production best practice balances cost against reproducibility with tiered storage: hot recent data on fast storage (retained 7 to 14 days) and warm historical data on cheaper object stores (retained 30 to 90 days). Shorter retention reduces storage costs but narrows the rollback window and limits forensic capabilities.
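A tiering policy of this kind reduces to a function of snapshot age. The sketch below assumes a 14-day hot window and a 90-day total retention, taken from the upper ends of the ranges above; the constant names and the `expired` label are illustrative choices, not a reference to any specific lifecycle tool.

```python
from datetime import date, timedelta

HOT_DAYS = 14        # assumed hot window (upper end of 7 to 14 days)
RETENTION_DAYS = 90  # assumed total retention (upper end of 30 to 90 days)


def storage_tier(snapshot_date: date, today: date) -> str:
    """Classify a snapshot as hot, warm, or expired based on its age in days."""
    age_days = (today - snapshot_date).days
    if age_days < 0:
        raise ValueError("snapshot_date is in the future")
    if age_days <= HOT_DAYS:
        return "hot"       # fast storage, serves recent time travel queries
    if age_days <= RETENTION_DAYS:
        return "warm"      # cheaper object store, still usable for rollback
    return "expired"       # eligible for deletion; rollback no longer possible


today = date(2024, 1, 31)
tiers = [storage_tier(today - timedelta(days=d), today) for d in (3, 14, 15, 90, 91)]
```

Lowering `RETENTION_DAYS` shrinks the warm tier and the storage bill, and shrinks the rollback window by the same amount.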

💡 Key Takeaways

✓ Feature versioning treats definitions, transformations, and schemas as immutable artifacts; models record exact feature version dependencies (v5) for reproducibility and rollback

✓ Time travel enables point-in-time consistent reads: querying features as of timestamp T reconstructs the exact values that existed then, enabling forensic debugging of prediction anomalies

✓ Safe rollback to a 30-day-old model requires features computed with 30-day-old logic; this works only if the feature store retains versioned definitions long enough (for example, 90 days) and supports on-demand backfill

✓ Storage cost trade-off: at petabyte scale, retaining 90 days of point-in-time snapshots is expensive; production systems use tiered storage with hot recent data (7 to 14 days) on fast storage and warm historical data (30 to 90 days) on cheaper object stores

✓ Dataset snapshots are timestamped and retained alongside feature versions, allowing identical retraining of historical models for counterfactual analysis or audit compliance

📌 Interview Tips

1. LinkedIn's Venice feature store links each Pro-ML model to specific feature schema versions; time travel queries make it possible to reconstruct the features served 60 days ago when diagnosing ranking anomalies

2. Uber's Michelangelo Palette feature store maintains 90 days of feature lineage; when an Estimated Time of Arrival (ETA) model was rolled back after 3 weeks, on-demand backfill regenerated v4 features so the old model could serve current requests without accuracy loss
