What is a Model Registry and Why Production ML Needs It
A model registry is the control plane for machine learning models in production. Think of it as version control for models, but with far richer metadata: it assigns a unique identity to every trained model artifact and tracks its complete lineage, including training data snapshots, code versions, hyperparameters, evaluation metrics, approval status, and deployment bindings. Unlike a shared folder of model files, the registry is the authoritative source of truth that answers critical questions: What model version should serve production traffic right now? How was this model trained, and on what data? What versions exist, and how do they compare on business metrics?
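To make that lineage concrete, here is a minimal sketch of the kind of record a registry might keep per version. The schema and field names are illustrative assumptions, not any particular product's format:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ModelVersion:
    """One immutable registry entry capturing identity and lineage."""
    name: str                    # model group, e.g. "fraud-scorer" (hypothetical)
    version: int                 # monotonically increasing within the group
    content_hash: str            # sha256 of the serialized artifact
    artifact_uri: str            # location of the artifact in object storage
    training_data_snapshot: str  # pointer to the exact data snapshot used
    code_version: str            # git commit of the training code
    hyperparameters: dict = field(default_factory=dict)
    eval_metrics: dict = field(default_factory=dict)  # held-out evaluation results
    stage: str = "registered"    # registered -> candidate -> staging -> production
```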
The registry bridges the gap between training pipelines and serving systems. When a training run completes, it writes a 300 MB to 3 GB artifact to object storage and registers a new model version with a content hash, training metadata, and evaluation results on held-out data. An automated evaluator compares the new version against baselines on business metrics and safety constraints, such as limits on calibration shift and fairness rules. If it passes, the version becomes a candidate for staging. A deployment pipeline watches registry events and orchestrates a progressive rollout, starting at 5% traffic for canary analysis, then 50%, then 100%. During this rollout, serving systems resolve which model to load at startup through a cached lookup, not on every request.
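The registration-and-promotion handoff can be sketched in a few lines. The `registry` client below, with its `put`, `get`, and `set_stage` methods, is a hypothetical interface, and the guardrail thresholds are illustrative:

```python
import hashlib

def register_version(registry, name, artifact_bytes, metadata):
    """Register a new version identified by the content hash of its artifact."""
    content_hash = hashlib.sha256(artifact_bytes).hexdigest()
    return registry.put(
        name=name,
        content_hash=content_hash,
        metadata=metadata,       # data snapshot, code version, hyperparameters
        stage="registered",
    )

def maybe_promote(registry, name, version, baseline):
    """Gate promotion to staging on business metrics and safety constraints."""
    cand = registry.get(name, version)["eval_metrics"]
    base = registry.get(name, baseline)["eval_metrics"]
    passes = (
        cand["auc"] >= base["auc"] - 0.002          # no meaningful regression
        and abs(cand["calibration_shift"]) < 0.01   # calibration guardrail
        and cand["fairness_gap"] <= base["fairness_gap"]
    )
    if passes:
        # The stage change emits the registry event the deployment
        # pipeline watches to begin the 5% canary rollout.
        registry.set_stage(name, version, "staging")
    return passes
```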
At scale, companies like Uber, Airbnb, and Meta rely on centralized model registries to coordinate hundreds of models and thousands of versions. Uber's Michelangelo stores models, metadata, and approval status with deploy-time bindings to prevent model drift. Airbnb's Bighead standardized how teams register models and attach metrics, enabling automated promotion when checks pass. Meta's FBLearner Flow ties models to training data snapshots and code versions for traceability across thousands of models. These systems handle hundreds of writes per day and thousands of reads per minute during deploy windows while maintaining p95 metadata lookup latency below 10 milliseconds.
Without a registry, teams face model-code skew, where services load the wrong model version; inconsistent rollbacks; poor audit trails for compliance; and dangerous race conditions during deployments. The registry adds process overhead and requires metadata discipline, but for any organization running multiple models or requiring reproducibility, it becomes essential infrastructure.
💡 Key Takeaways
• Registry assigns a unique identity to each model version using a content hash and tracks complete lineage, including training data, code, hyperparameters, and evaluation metrics
• Serves as a control plane answering what model to use now, how it was trained, and how versions compare on business metrics
• Production systems resolve the model version at startup with a cached lookup under 10ms p95, not on every inference request, to avoid latency impact (see the sketch after this list)
• Enables safe progressive rollouts starting at 5% traffic for canary analysis, automatically promoting to 100% if guardrails pass
• Prevents model-code skew by binding model versions to application releases and enforcing compatibility checks during promotion
• At scale, handles hundreds of model groups, thousands of versions, hundreds of writes per day, and thousands of reads per minute during deploys
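The cached-lookup pattern in the third takeaway can be shown directly. A minimal sketch, assuming a hypothetical registry client with a `get_stage_binding` method; the 60-second TTL is an illustrative choice:

```python
import time

class CachedResolver:
    """Resolve the production model binding once per TTL, not per request."""

    def __init__(self, registry, ttl_seconds=60):
        self.registry = registry
        self.ttl = ttl_seconds
        self._cache = {}  # model name -> (resolved version, fetch timestamp)

    def resolve(self, name):
        hit = self._cache.get(name)
        if hit and time.monotonic() - hit[1] < self.ttl:
            return hit[0]  # cache hit: no registry round trip on the hot path
        # Cache miss or expired entry: one metadata lookup against the registry.
        version = self.registry.get_stage_binding(name, stage="production")
        self._cache[name] = (version, time.monotonic())
        return version
```

A serving process calls `resolve` at startup and on periodic refresh, so a slow registry lookup never sits on the inference path.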
📌 Examples
Uber Michelangelo: Central model catalog stores models with approval status and deploy-time bindings to prevent model drift across hundreds of models
Airbnb Bighead: Model repository standardizes registration, attaches offline and online metrics, automates promotion when evaluation checks pass
Meta FBLearner Flow: Integrates evaluation, approval, and deployment with registry layer tying models to training data snapshots for traceability
Netflix: Model catalogs integrate with canary systems so model upgrades follow same progressive rollout as service binaries with automated rollback