Model Serving & InferenceModel Versioning & RollbackHard⏱️ ~3 min

Feature Versioning and Time Travel for Reproducible Rollback

Feature versioning treats feature definitions, transformations, and schemas as immutable versioned artifacts alongside model weights. Every feature has a version identifier linking its computation logic, dependencies, and data sources. When a model trains on feature set v5, that model's manifest records v5 explicitly. Time travel capability allows point in time consistent reads: querying the feature store for user 12345 as of timestamp T returns exactly the feature values that existed at T, regardless of later updates or backfills. This enables two critical rollback scenarios. First, forensic debugging: when a prediction anomaly occurs, engineers can reconstruct the exact inputs the model received by time traveling to the request timestamp and comparing against expected distributions. Second, safe model reversion: rolling back to a 30 day old model requires features computed with 30 day old logic. If the feature store retains 90 days of versioned definitions and supports on demand backfill, the old model can serve current requests with compatible features. Netflix and LinkedIn maintain feature lineage linking each model version to specific feature schema versions. Dataset snapshots are also time stamped and retained, allowing retraining historical models identically. The storage cost is nontrivial: at Uber's scale (petabytes of feature data), retaining 90 days of point in time snapshots requires data lifecycle policies and compaction strategies. The tradeoff is cost versus reproducibility: shorter retention (7 to 14 days) reduces storage but limits rollback windows and forensic capabilities. Production best practice balances this with tiered storage: hot recent data on fast storage, warm historical data on cheaper object stores.
💡 Key Takeaways
Feature versioning treats definitions, transformations, and schemas as immutable artifacts; models record exact feature version dependencies (v5) for reproducibility and rollback
Time travel enables point in time consistent reads: querying features as of timestamp T reconstructs exact values that existed then, enabling forensic debugging of prediction anomalies
Safe rollback to 30 day old models requires features computed with 30 day old logic; feature stores retain 90 days of versioned definitions with on demand backfill capability
Storage cost tradeoff: at petabyte scale retaining 90 days of point in time snapshots is expensive; production systems use tiered storage with hot recent data (7 days) and warm historical data (83 days) on cheaper object stores
Dataset snapshots are time stamped and retained alongside feature versions, allowing identical retraining of historical models for counterfactual analysis or audit compliance
📌 Examples
LinkedIn's Venice feature store links each Pro ML model to specific feature schema versions; time travel queries enable reconstructing features served 60 days ago for diagnosing ranking anomalies
Uber's Zipline maintains 90 days of feature lineage; when an Estimated Time of Arrival (ETA) model was rolled back after 3 weeks, on demand backfill regenerated v4 features so the old model could serve current requests without accuracy loss
← Back to Model Versioning & Rollback Overview