Feature Versioning and Time Travel for Reproducible Rollback
Why Version Features
Feature versioning treats feature definitions, transformations, and schemas as immutable versioned artifacts alongside model weights. Every feature has a version identifier linking its computation logic, dependencies, and data sources. When a model trains on feature set v5, that model's manifest records v5 explicitly. This prevents the silent failures that occur when feature computation changes but models are not retrained.
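As a minimal sketch of how a manifest could pin a model to the exact feature logic it trained on, the example below fingerprints each feature definition and compares a stored manifest against the definitions currently deployed. The names FeatureDefinition, ModelManifest, build_manifest, and find_drift are hypothetical and do not come from any particular feature store.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class FeatureDefinition:
    """Hypothetical immutable description of one feature version."""
    name: str
    version: int
    transformation: str                 # computation logic, stored as text
    dependencies: Tuple[str, ...] = ()  # upstream features or tables
    source: str = ""                    # data source identifier

    def fingerprint(self) -> str:
        # Hash every field so any change to logic, inputs, or source is detected.
        payload = json.dumps(
            {
                "name": self.name,
                "version": self.version,
                "transformation": self.transformation,
                "dependencies": list(self.dependencies),
                "source": self.source,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ModelManifest:
    """Hypothetical record written at training time."""
    model_id: str
    feature_set_version: str
    feature_fingerprints: Dict[str, str]  # feature name -> fingerprint


def build_manifest(model_id: str, feature_set_version: str,
                   definitions: Iterable[FeatureDefinition]) -> ModelManifest:
    return ModelManifest(
        model_id=model_id,
        feature_set_version=feature_set_version,
        feature_fingerprints={d.name: d.fingerprint() for d in definitions},
    )


def find_drift(manifest: ModelManifest,
               current: Iterable[FeatureDefinition]) -> List[str]:
    """Names of features that are missing or differ from what the model trained on."""
    deployed = {d.name: d.fingerprint() for d in current}
    return sorted(
        name
        for name, fp in manifest.feature_fingerprints.items()
        if deployed.get(name) != fp
    )
```

A serving system could run a check like find_drift before loading a model and refuse to serve, or trigger retraining, when the returned list is not empty.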
Time Travel Capability
Time travel allows point-in-time consistent reads: querying the feature store for user 12345 as of timestamp T returns exactly the feature values that existed at T, regardless of later updates or backfills. This is hard to implement at scale because feature stores typically overwrite values on each update, which destroys the history a point-in-time read needs. Implementations therefore use either event sourcing (storing every value change as an append-only log) or snapshot isolation (periodic consistent snapshots with delta storage between them).
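The event-sourcing approach can be sketched as an append-only log in which each record carries both the time the value became valid and the time the store recorded it. This is an illustrative in-memory model, not a production design: FeatureEvent, EventLogFeatureStore, and the field names are assumptions, and the linear scan stands in for whatever index a real store would use.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class FeatureEvent:
    """Hypothetical log record for one feature value change."""
    entity_id: str
    feature: str
    event_time: float    # when the value became valid
    ingested_at: float   # when the store recorded it
    value: Any


class EventLogFeatureStore:
    """Append-only log; nothing is ever overwritten."""

    def __init__(self) -> None:
        self._log: List[FeatureEvent] = []

    def append(self, event: FeatureEvent) -> None:
        self._log.append(event)

    def read_as_of(self, entity_id: str, feature: str, as_of: float) -> Optional[Any]:
        """Return the value visible at `as_of`, or None if none existed yet.

        An event counts only if it was both valid and already recorded at
        `as_of`, so a backfill written later cannot change a past read.
        """
        best: Optional[FeatureEvent] = None
        for e in self._log:
            if e.entity_id != entity_id or e.feature != feature:
                continue
            if e.event_time > as_of or e.ingested_at > as_of:
                continue
            if best is None or (e.event_time, e.ingested_at) > (best.event_time, best.ingested_at):
                best = e
        return None if best is None else best.value


store = EventLogFeatureStore()
store.append(FeatureEvent("12345", "clicks_7d", event_time=100, ingested_at=100, value=3))
store.append(FeatureEvent("12345", "clicks_7d", event_time=200, ingested_at=200, value=8))
# A backfill recorded at t=300 that corrects the value valid from t=150.
store.append(FeatureEvent("12345", "clicks_7d", event_time=150, ingested_at=300, value=5))
```

Filtering on ingested_at as well as event_time is what keeps a read at T stable after a backfill. A store that tracks only event time would return the corrected value instead, which is useful for retraining but does not reproduce what the model actually saw.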
Critical Rollback Scenarios
Time travel enables two critical rollback scenarios. First, forensic debugging: when a prediction anomaly occurs, engineers can reconstruct the exact inputs the model received by time traveling to the request timestamp, then compare those inputs against expected distributions. Second, safe model reversion: rolling back to a 30-day-old model requires features computed with 30-day-old logic. If the feature store retains 90 days of versioned definitions and supports on-demand backfill, the old model can serve current requests with compatible features.
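The preconditions for safe model reversion can be written down as a small pre-flight check. The sketch below assumes that retention is measured from the moment a definition version was superseded, which the text does not specify. The function name rollback_blockers and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def rollback_blockers(
    superseded_at: Optional[datetime],
    now: datetime,
    retention_days: int = 90,
    supports_backfill: bool = True,
) -> List[str]:
    """Return reasons a rollback would be unsafe; an empty list means no blockers.

    `superseded_at` is when the feature definitions the old model trained on
    were replaced, or None if they are still the current definitions.
    """
    if superseded_at is None:
        # The old model uses the logic that is serving today; nothing to rebuild.
        return []

    blockers: List[str] = []
    if now - superseded_at > timedelta(days=retention_days):
        blockers.append("feature definitions fall outside the retention window")
    if not supports_backfill:
        blockers.append("store cannot backfill features with the old logic")
    return blockers


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
thirty_days_old = rollback_blockers(now - timedelta(days=30), now)
```

Returning the reasons rather than a single boolean lets a deployment tool report why a rollback was refused.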
Storage and Retention Trade-offs
Netflix and LinkedIn maintain feature lineage linking each model version to specific feature schema versions. Dataset snapshots are also timestamped and retained, so historical models can be retrained identically. The storage cost is nontrivial: at Uber's scale (petabytes of feature data), retaining 90 days of point-in-time snapshots requires data lifecycle policies and compaction strategies. Production best practice balances cost against reproducibility with tiered storage: hot recent data on fast storage (7 to 14 days) and warm historical data on cheaper object stores (30 to 90 days). Shorter retention reduces storage costs but narrows the rollback window and limits forensic capabilities.
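A tiering policy of this kind reduces to a mapping from snapshot age to a storage class. The sketch below uses the upper bounds quoted above (14 days hot, 90 days warm) as defaults. The function name storage_tier, the tier labels, and the choice to treat everything between the two bounds as warm are assumptions.

```python
def storage_tier(age_days: float, hot_days: int = 14, warm_days: int = 90) -> str:
    """Classify a snapshot by age: 'hot', 'warm', or 'expired'."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if hot_days > warm_days:
        raise ValueError("hot_days must not exceed warm_days")
    if age_days <= hot_days:
        return "hot"       # fast storage for recent data
    if age_days <= warm_days:
        return "warm"      # cheaper object storage for historical data
    return "expired"       # beyond retention; eligible for deletion


tiers = [storage_tier(age) for age in (3, 14, 45, 91)]
```

Lowering warm_days is the knob that trades storage cost for a shorter rollback and forensics window.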