
Model Versioning, Rollout, and Governance

Production machine learning systems require rigorous versioning and rollout processes to maintain reliability and enable safe iteration. Every model checkpoint bundles weights, preprocessing code, taxonomy version, training metrics, and metadata into a versioned artifact. This ensures reproducibility and prevents training-serving skew. When a new model version deploys, the system must handle version mixing during the transition, cache invalidation, and rollback capability.

Canary deployments send 1 to 5 percent of live traffic to the new model while keeping the majority on the existing version. Teams monitor key metrics including prediction latency, error rates, and accuracy proxies such as confidence distributions. If the canary shows regression, for example p99 latency increasing from 80 ms to 150 ms or average confidence dropping by 0.05, the rollout stops and engineering investigates. Shadow inference runs the new model on 10 percent of production traffic without serving results to users, comparing predictions and latency against the current model to detect distribution shifts before they affect users.

For high-risk domains like content moderation or financial fraud, human-in-the-loop review is mandatory for low-confidence predictions. Predictions with confidence below 0.8 route to human auditors who provide ground-truth labels. These labels feed back into the training data, creating a continuous improvement loop. Audit trails track every prediction with model version, input hash, output label, confidence, and timestamp for compliance and debugging. Pinterest maintains 90 days of audit logs; at 10 billion daily predictions and 5 KB per record, that is roughly 50 TB of log data per day.

Rollback plans are critical. Versioned cache keys enable instant traffic shifts by changing the lookup key from modelv47 to modelv46. Dual-write periods maintain predictions from both old and new models during the transition, enabling A/B comparison and instant fallback. Cost planning uses QPS, cache hit rate, and per-request compute to size fleets. At 20,000 QPS with a 90% cache hit rate and 250 QPS per GPU, the minimum fleet is 8 GPUs, but teams provision 12 for headroom, costing approximately $15K per month at cloud GPU pricing.
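The canary gate described above reduces to a comparison between baseline and candidate metrics over the observation window. Below is a minimal sketch, not any team's actual tooling: it assumes aggregated per-version metrics are already available, echoes the p99 latency and 0.05-confidence-drop figures from the text, and uses an illustrative error-rate threshold of its own.

```python
from dataclasses import dataclass

@dataclass
class CanaryMetrics:
    """Aggregated metrics for one model version over the observation window."""
    p99_latency_ms: float
    error_rate: float
    mean_confidence: float

def canary_passes(baseline: CanaryMetrics, candidate: CanaryMetrics,
                  max_latency_ratio: float = 1.5,
                  max_error_rate_delta: float = 0.005,
                  max_confidence_drop: float = 0.05) -> bool:
    """Return True if the candidate may continue rolling out."""
    if candidate.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return False  # latency regression, e.g. p99 going from 80 ms to 150 ms
    if candidate.error_rate - baseline.error_rate > max_error_rate_delta:
        return False  # error-rate regression (illustrative threshold)
    if baseline.mean_confidence - candidate.mean_confidence > max_confidence_drop:
        return False  # accuracy-proxy regression: average confidence dropped too far
    return True

# The regression scenario from the text halts the rollout.
baseline = CanaryMetrics(p99_latency_ms=80, error_rate=0.001, mean_confidence=0.91)
candidate = CanaryMetrics(p99_latency_ms=150, error_rate=0.001, mean_confidence=0.90)
assert not canary_passes(baseline, candidate)
```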
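The human-in-the-loop routing and the audit-trail record can be sketched in a few lines as well. This is a hypothetical sketch: plain Python lists stand in for the real review queue and log sink, while the record fields follow the ones named in the text (model version, input hash, output label, confidence, timestamp).

```python
import hashlib
import json
import time

REVIEW_THRESHOLD = 0.8  # predictions below this confidence go to human auditors

def handle_prediction(image_bytes: bytes, label: str, confidence: float,
                      model_version: str, audit_log: list, review_queue: list) -> None:
    """Record every prediction for the audit trail; queue low-confidence ones for review."""
    record = {
        "model_version": model_version,
        "input_hash": hashlib.sha256(image_bytes).hexdigest(),
        "label": label,
        "confidence": confidence,
        "timestamp": time.time(),
    }
    audit_log.append(json.dumps(record))   # compliance and debugging trail
    if confidence < REVIEW_THRESHOLD:
        review_queue.append(record)        # human auditors supply ground-truth labels

audit_log, review_queue = [], []
handle_prediction(b"...", "golden_retriever", 0.62, "modelv47", audit_log, review_queue)
assert len(review_queue) == 1  # 0.62 < 0.8, so this prediction is routed for review
```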
💡 Key Takeaways
Model checkpoints bundle weights, preprocessing code, taxonomy version, and training metrics into versioned artifacts to ensure reproducibility and prevent training-serving skew
Canary deployments send 1 to 5 percent of traffic to new models, rolling back if p99 latency regresses (for example from 80 ms to 150 ms) or average confidence drops by 0.05 or more
Shadow inference on 10 percent of traffic compares new and current models without user impact, detecting distribution shifts before full deployment
Human-in-the-loop review for predictions below 0.8 confidence provides ground-truth labels that feed back into training for continuous improvement
Audit trails tracking model version, input hash, output label, confidence, and timestamp for every prediction enable compliance and debugging, storing roughly 50 TB per day at 10B daily predictions and 5 KB per record, retained for 90 days
Versioned cache keys enable instant rollback by changing the lookup key from modelv47 to modelv46, and dual-write periods maintain both versions during the transition (see the sketch below)
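A minimal sketch of the versioned-cache-key rollback and dual-write idea from the takeaway above, assuming an in-memory dict stands in for the production cache and that the old model's predictions are still being produced during the dual-write window:

```python
# A stand-in prediction cache; production would use something like Memcached or Redis.
cache: dict[str, str] = {}

ACTIVE_VERSION = "modelv47"    # version currently being served
PREVIOUS_VERSION = "modelv46"  # kept warm for rollback
DUAL_WRITE = True              # True during the transition window

def cache_key(version: str, input_hash: str) -> str:
    # The model version is part of the key, so old entries never need invalidating.
    return f"{version}:{input_hash}"

def store_predictions(input_hash: str, new_label: str, old_label: str | None = None) -> None:
    """Write the serving model's prediction; during dual-write, also keep
    the previous model's prediction under its own versioned key."""
    cache[cache_key(ACTIVE_VERSION, input_hash)] = new_label
    if DUAL_WRITE and old_label is not None:
        cache[cache_key(PREVIOUS_VERSION, input_hash)] = old_label

def lookup_prediction(input_hash: str) -> str | None:
    return cache.get(cache_key(ACTIVE_VERSION, input_hash))

store_predictions("abc123", new_label="dog", old_label="canine")
assert lookup_prediction("abc123") == "dog"

# Rollback is a single version flip: lookups immediately hit the old model's
# still-warm entries, with no cache invalidation required.
ACTIVE_VERSION = "modelv46"
assert lookup_prediction("abc123") == "canine"
```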
📌 Examples
Meta content moderation rollout: Shadow 10% traffic for 48 hours, canary 2% user traffic for 24 hours monitoring false negative rate, full rollout over 7 days with dual cache writes, rollback ready via cache key flip
Amazon product classification: Human review queue for confidence < 0.75 handles 8% of predictions, auditors label 200K images per week, improves rare category F1 by 12 points in next training cycle
Google Photos capacity planning: 20,000 QPS, 90% cache hit, 2,000 QPS reaching the GPU tier, 250 QPS per A100, provision 12 GPUs against an 8-GPU minimum for headroom, roughly $15K per month at $1,250 per GPU per month cloud pricing (see the sizing sketch below)
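The fleet-sizing arithmetic from the capacity-planning example can be written out directly. A rough sketch; the 1.5x headroom factor and the $1,250 per GPU per month price are assumptions chosen to reproduce the quoted 8-minimum, 12-provisioned, $15K-per-month figures.

```python
import math

def size_gpu_fleet(total_qps: float, cache_hit_rate: float, qps_per_gpu: float,
                   headroom: float = 1.5, cost_per_gpu_month: float = 1250.0):
    """Back-of-the-envelope GPU fleet sizing from QPS, cache hit rate, and per-GPU throughput."""
    gpu_qps = total_qps * (1.0 - cache_hit_rate)     # traffic that misses the cache
    min_gpus = math.ceil(gpu_qps / qps_per_gpu)      # bare-minimum fleet
    provisioned = math.ceil(min_gpus * headroom)     # add headroom for spikes and failures
    monthly_cost = provisioned * cost_per_gpu_month
    return min_gpus, provisioned, monthly_cost

# 20,000 QPS, 90% cache hit, 250 QPS per A100:
print(size_gpu_fleet(20_000, 0.90, 250))  # -> (8, 12, 15000.0)
```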