Model Monitoring & Observability: Concept Drift & Model Decay

What is Concept Drift vs Data Drift vs Model Decay?

Concept drift occurs when the relationship between inputs and the target changes over time: the conditional distribution P(y|X) shifts, so patterns the model learned during training no longer hold in production. For example, user preferences for movie genres might shift after a major cultural event, so the same features now predict different outcomes.

Data drift is different. It refers to changes in the input feature distribution P(X) itself, not in the relationship to the target. Your model might still work correctly on the new distribution, but the inputs look different from the training data. For instance, if new smartphone models flood the market, your device feature distribution shifts even if user behavior patterns remain stable.

Model decay is the observable consequence: the measurable loss of predictive performance in production resulting from one or both types of drift. At Netflix, with 250M+ members generating billions of events daily, concept drift appears after major content releases. At Uber, traffic patterns shift by city and hour, causing real-time travel time residuals to diverge from baseline. The key insight is that decay is what you measure (like Precision@10 dropping from 0.85 to 0.78), while drift is the underlying cause you must diagnose.
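To make data drift detection concrete, here is a minimal sketch of the Population Stability Index (PSI) check referenced in the takeaways below. The bin count, the simulated feature values, and the specific thresholds in the comment are illustrative assumptions, not a production recipe.

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a training-time (expected) and
    production (actual) sample of one feature. Higher means more drift."""
    # Bin edges come from the training distribution so both samples share
    # bins; the outer edges are widened so production outliers still count.
    edges = np.histogram_bin_edges(expected, bins=bins)
    edges[0], edges[-1] = -np.inf, np.inf
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip empty bins to avoid division by zero and log(0).
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Illustrative only: a shifted production sample trips the PSI thresholds.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 50_000)  # feature values at training time
prod = rng.normal(0.5, 1.2, 50_000)   # same feature in production
print(f"PSI = {psi(train, prod):.3f}")  # ~0.1 concern, ~0.2 moderate, ~0.3 severe
```

Note that PSI only detects shifts in P(X); a feature can pass a PSI check while P(y|X) changes underneath it, which is why decay metrics like the AUC trigger below are monitored separately.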
💡 Key Takeaways
Concept drift means P(y|X) changes. The same inputs now predict different outputs. This requires retraining because your learned patterns are obsolete.
Data drift means P(X) changes. The input distribution shifts but the relationship to the target may still hold. Monitor it with the Population Stability Index (PSI, sketched above), where roughly 0.1 flags concern, 0.2 is moderate, and 0.3 is severe.
Model decay is the measurable performance loss. At Google and Meta ads platforms scoring 100k to 1M predictions per second, Click-Through Rate (CTR) model Area Under the Curve (AUC) drops of 5 to 10% trigger retraining; a minimal decay-trigger sketch follows this list.
Drift patterns include sudden (a regulatory change), gradual (a seasonal preference shift), incremental (small daily changes that accumulate), and recurring (weekday vs weekend behavior at Netflix).
Label latency determines detection speed. Clicks arrive in minutes, purchases in hours, chargebacks in weeks. Stripe fraud models use proxy labels within 1 to 24 hours, then run delayed correction jobs weeks later.
Both kinds of drift can occur together. A product redesign changes feature distributions (data drift) and user interaction patterns (concept drift) simultaneously, compounding model decay.
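As a sketch of how a decay trigger like the CTR example above might work, the following compares the AUC on a recent labeled window against a deployment-time baseline and flags retraining when the relative drop exceeds a threshold. The baseline value, the 5% threshold, and the tiny sample window are assumptions for illustration.

```python
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.85    # AUC measured at deployment time (assumed)
RELATIVE_DROP = 0.05   # retrain if AUC falls more than 5% below baseline

def should_retrain(labels, scores) -> bool:
    """Flag retraining when the recent-window AUC decays past threshold.

    labels: ground-truth clicks (0/1) for the most recent labeled window
    scores: the model's predicted CTRs for the same events
    """
    window_auc = roc_auc_score(labels, scores)
    drop = (BASELINE_AUC - window_auc) / BASELINE_AUC
    return drop > RELATIVE_DROP

# Illustrative call on a small labeled window.
labels = [1, 0, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3, 0.5, 0.1]
print("retrain?", should_retrain(labels, scores))
```

Because labels arrive with the latencies described above, the "recent labeled window" in practice trails production traffic; proxy labels narrow that gap at the cost of later correction.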
📌 Examples
Netflix personalization: After a major series release, viewing patterns shift within hours (concept drift). The system increases exploration rates for new titles for 30 to 120 minutes to learn fresh click and play patterns.
Uber ETA prediction: Real-time traffic conditions change road-segment speed distributions by hour and city (data drift). Monitors compare recent travel time residuals against a rolling 7-day baseline, triggering recalibration when Mean Absolute Error (MAE) worsens by more than 10% for 30 minutes; a minimal version of this trigger appears in the sketch after these examples.
Stripe fraud detection: Attackers change tactics within hours (concept drift). Short-horizon models retrain hourly during attacks with aggressive weights on recent data, while long-horizon models prevent overfitting to attacker noise.
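Here is a minimal sketch of an Uber-style residual monitor as described above: it compares per-minute MAE against a rolling baseline and fires only when the degradation persists for the full window. The class name, window lengths, baseline value, and per-minute batching are illustrative assumptions.

```python
from collections import deque

class ResidualMonitor:
    """Fire when recent MAE stays >10% above a rolling baseline MAE."""

    def __init__(self, baseline_mae: float, threshold: float = 0.10,
                 window: int = 30):
        self.baseline_mae = baseline_mae      # e.g. from a rolling 7-day job
        self.threshold = threshold            # allowed relative degradation
        self.breaches = deque(maxlen=window)  # one flag per minute

    def observe_minute(self, abs_errors: list[float]) -> bool:
        """Record one minute of |predicted - actual| travel times (seconds).
        Returns True once every minute in the window has breached."""
        mae = sum(abs_errors) / len(abs_errors)
        self.breaches.append(mae > self.baseline_mae * (1 + self.threshold))
        return len(self.breaches) == self.breaches.maxlen and all(self.breaches)

# Illustrative: 30 consecutive degraded minutes trigger recalibration.
monitor = ResidualMonitor(baseline_mae=120.0)  # assumed baseline, in seconds
for _ in range(30):
    triggered = monitor.observe_minute([150.0, 140.0, 135.0])
print("recalibrate?", triggered)
```

Requiring a sustained breach rather than a single bad minute is what keeps a monitor like this from paging on transient spikes while still catching genuine drift.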