What Causes Model Performance Degradation in Production?
ROOT CAUSES
Data drift: Input feature distributions shift away from the training distribution. User behavior changes, new product categories appear, or seasonal effects kick in. The model encounters data it was never trained on.
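One common way to quantify this kind of shift is the Population Stability Index (PSI), which compares a live feature's distribution against its training distribution. The sketch below uses synthetic data; the 0.2 alarm threshold is a widely used heuristic, not a universal rule.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a live feature distribution ("actual") against its
    training distribution ("expected"). PSI > 0.2 is a common
    heuristic threshold for meaningful drift."""
    # Bin edges come from the training distribution.
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the proportions to avoid log(0).
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 10_000)    # training-time feature values
same = rng.normal(0.0, 1.0, 10_000)     # production looks the same
shifted = rng.normal(0.5, 1.0, 10_000)  # production mean has drifted

print(population_stability_index(train, same))     # near 0: no drift
print(population_stability_index(train, shifted))  # above 0.2: drift alarm
```

Running a check like this per feature on a schedule turns "the inputs quietly changed" into an explicit signal.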
Concept drift: The relationship between features and outcomes changes. What used to predict success no longer does. A fraud model loses effectiveness as fraudsters adapt their tactics.
Feature decay: Features become less predictive over time. A feature based on last year's engagement patterns loses relevance as user behavior evolves.
Upstream failures: Data pipelines fail silently. Features are computed incorrectly or become stale. The model receives garbage inputs and produces garbage outputs.
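A cheap defense against silent upstream failures is validating feature rows before they reach the model. The field names and thresholds below are illustrative assumptions, not a real schema:

```python
from datetime import datetime, timedelta, timezone

def validate_features(row, max_age=timedelta(hours=6)):
    """Reject stale or malformed feature rows before scoring.
    Field names (user_age, spend_30d, computed_at) are hypothetical."""
    errors = []
    if row.get("user_age") is None or not (0 <= row["user_age"] <= 120):
        errors.append("user_age missing or out of range")
    if row.get("spend_30d") is not None and row["spend_30d"] < 0:
        errors.append("spend_30d negative")
    computed_at = row.get("computed_at")
    if computed_at is None or datetime.now(timezone.utc) - computed_at > max_age:
        errors.append("features stale or timestamp missing")
    return errors

fresh = {"user_age": 34, "spend_30d": 12.5,
         "computed_at": datetime.now(timezone.utc)}
stale = {"user_age": 34, "spend_30d": -1.0,
         "computed_at": datetime.now(timezone.utc) - timedelta(days=2)}

print(validate_features(fresh))  # []
print(validate_features(stale))  # ['spend_30d negative', 'features stale ...']
```

Routing rows that fail validation to a fallback (or at least a metric) prevents garbage inputs from silently becoming garbage predictions.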
DEGRADATION TIMELINE
Models typically lose 1-5% of their predictive performance per month without intervention. High-velocity domains (fraud, trending content, pricing) degrade faster, potentially 10-20% per month. Stable domains (document classification, image recognition) degrade more slowly.
Degradation is often gradual and invisible until significant damage occurs. By the time someone notices predictions are bad, the model may have been underperforming for weeks.
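The monthly figures above compound, which is why even "slow" decay causes real damage over a quarter or two. A quick worked example using the section's own 1-5% range:

```python
# Compounding monthly performance loss over six months without retraining.
# The 1%, 3%, 5% rates come from the typical range cited above.
for monthly_loss in (0.01, 0.03, 0.05):
    retained = (1 - monthly_loss) ** 6
    print(f"{monthly_loss:.0%}/month -> {1 - retained:.1%} lost after 6 months")
```

Even at 3% per month, roughly a sixth of the model's performance is gone within six months.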
WHY MONITORING MATTERS
Unlike traditional software bugs that crash visibly, ML degradation fails silently. The model continues producing predictions—they are just wrong. Proactive monitoring catches degradation before business impact becomes severe.
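One minimal form of such monitoring is tracking a quality metric over a sliding window of labeled predictions and alerting when it falls below baseline. The window size, baseline, and tolerance below are illustrative assumptions:

```python
from collections import deque

class RollingAccuracyMonitor:
    """Alert when windowed accuracy drops below baseline - tolerance.
    A minimal sketch; parameters are illustrative, not recommendations."""

    def __init__(self, baseline=0.90, tolerance=0.05, window=500):
        self.threshold = baseline - tolerance
        self.outcomes = deque(maxlen=window)  # True = correct prediction

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def degraded(self):
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough labeled outcomes yet
        return sum(self.outcomes) / len(self.outcomes) < self.threshold

monitor = RollingAccuracyMonitor()
for _ in range(500):
    monitor.record(1, 1)              # model is accurate at first
print(monitor.degraded())             # False

for i in range(500):
    monitor.record(1, 1 if i % 2 else 0)  # silently drops to ~50% accuracy
print(monitor.degraded())             # True: alert before weeks pass
```

The model never crashes in this scenario; only the windowed metric reveals that its predictions have quietly gone bad.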