Model Monitoring & Observability: Concept Drift & Model Decay

What is Concept Drift vs Data Drift vs Model Decay?

Concept drift occurs when the relationship between inputs and the target changes over time: the conditional distribution P(y|X) shifts, so patterns the model learned during training no longer hold in production. For example, user preferences for movie genres might shift after a major cultural event, so the same features now predict different outcomes.

Data drift is different. It refers to changes in the input feature distribution P(X) itself, not in the relationship to the target. Your model might still work correctly on the new distribution, but the inputs look different from the training data. For instance, if new smartphone models flood the market, your device feature distribution shifts even if user behavior patterns remain stable.

Model decay is the observable consequence: the measurable loss of predictive performance in production resulting from one or both types of drift. At Netflix, with 250M+ members generating billions of events daily, concept drift appears after major content releases. At Uber, traffic patterns shift by city and hour, causing real-time travel time residuals to diverge from baseline. The key insight is that decay is what you measure (like Precision@10 dropping from 0.85 to 0.78), while drift is the underlying cause you must diagnose.
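To make data drift detection concrete, here is a minimal sketch of the Population Stability Index (PSI) check referenced in the takeaways below. The bin count, the simulated feature values, and the specific thresholds in the comment are illustrative assumptions, not a production recipe.

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a training-time (expected) and
    production (actual) sample of one feature. Higher means more drift."""
    # Bin edges come from the training distribution so both samples share
    # bins; the outer edges are widened so production outliers still count.
    edges = np.histogram_bin_edges(expected, bins=bins)
    edges[0], edges[-1] = -np.inf, np.inf
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip empty bins to avoid division by zero and log(0).
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Illustrative only: a shifted production sample trips the PSI thresholds.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 50_000)  # feature values at training time
prod = rng.normal(0.5, 1.2, 50_000)   # same feature in production
print(f"PSI = {psi(train, prod):.3f}")  # ~0.1 concern, ~0.2 moderate, ~0.3 severe
```

Note that PSI only detects shifts in P(X); a feature can pass a PSI check while P(y|X) changes underneath it, which is why decay metrics like the AUC trigger below are monitored separately.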
💡 Key Takeaways
Concept drift means P(y|X) changes. The same inputs now predict different outputs. This requires retraining because your learned patterns are obsolete.
Data drift means P(X) changes. The input distribution shifts but the relationship to the target may still hold. Monitor it with the Population Stability Index (PSI, sketched above), where roughly 0.1 flags concern, 0.2 is moderate, and 0.3 is severe.
Model decay is the measurable performance loss. At Google and Meta ads platforms scoring 100k to 1M predictions per second, Click-Through Rate (CTR) model Area Under the Curve (AUC) drops of 5 to 10% trigger retraining; a minimal decay-trigger sketch follows this list.
Drift patterns include sudden (a regulatory change), gradual (a seasonal preference shift), incremental (small daily changes that accumulate), and recurring (weekday vs weekend behavior at Netflix).
Label latency determines detection speed. Clicks arrive in minutes, purchases in hours, chargebacks in weeks. Stripe fraud models use proxy labels within 1 to 24 hours, then run delayed correction jobs weeks later.
Both kinds of drift can occur together. A product redesign changes feature distributions (data drift) and user interaction patterns (concept drift) simultaneously, compounding model decay.
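As a sketch of how a decay trigger like the CTR example above might work, the following compares the AUC on a recent labeled window against a deployment-time baseline and flags retraining when the relative drop exceeds a threshold. The baseline value, the 5% threshold, and the tiny sample window are assumptions for illustration.

```python
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.85    # AUC measured at deployment time (assumed)
RELATIVE_DROP = 0.05   # retrain if AUC falls more than 5% below baseline

def should_retrain(labels, scores) -> bool:
    """Flag retraining when the recent-window AUC decays past threshold.

    labels: ground-truth clicks (0/1) for the most recent labeled window
    scores: the model's predicted CTRs for the same events
    """
    window_auc = roc_auc_score(labels, scores)
    drop = (BASELINE_AUC - window_auc) / BASELINE_AUC
    return drop > RELATIVE_DROP

# Illustrative call on a small labeled window.
labels = [1, 0, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3, 0.5, 0.1]
print("retrain?", should_retrain(labels, scores))
```

Because labels arrive with the latencies described above, the "recent labeled window" in practice trails production traffic; proxy labels narrow that gap at the cost of later correction.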
📌 Examples
Netflix personalization: After a major series release, viewing patterns shift within hours (concept drift). The system increases exploration rates for new titles for 30 to 120 minutes to learn fresh click and play patterns.
Uber ETA prediction: Real-time traffic conditions change road-segment speed distributions by hour and city (data drift). Monitors compare recent travel time residuals against a rolling 7-day baseline, triggering recalibration when Mean Absolute Error (MAE) worsens by more than 10% for 30 minutes; a minimal version of this trigger appears in the sketch after these examples.
Stripe fraud detection: Attackers change tactics within hours (concept drift). Short-horizon models retrain hourly during attacks with aggressive weights on recent data, while long-horizon models prevent overfitting to attacker noise.
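Here is a minimal sketch of an Uber-style residual monitor as described above: it compares per-minute MAE against a rolling baseline and fires only when the degradation persists for the full window. The class name, window lengths, baseline value, and per-minute batching are illustrative assumptions.

```python
from collections import deque

class ResidualMonitor:
    """Fire when recent MAE stays >10% above a rolling baseline MAE."""

    def __init__(self, baseline_mae: float, threshold: float = 0.10,
                 window: int = 30):
        self.baseline_mae = baseline_mae      # e.g. from a rolling 7-day job
        self.threshold = threshold            # allowed relative degradation
        self.breaches = deque(maxlen=window)  # one flag per minute

    def observe_minute(self, abs_errors: list[float]) -> bool:
        """Record one minute of |predicted - actual| travel times (seconds).
        Returns True once every minute in the window has breached."""
        mae = sum(abs_errors) / len(abs_errors)
        self.breaches.append(mae > self.baseline_mae * (1 + self.threshold))
        return len(self.breaches) == self.breaches.maxlen and all(self.breaches)

# Illustrative: 30 consecutive degraded minutes trigger recalibration.
monitor = ResidualMonitor(baseline_mae=120.0)  # assumed baseline, in seconds
for _ in range(30):
    triggered = monitor.observe_minute([150.0, 140.0, 135.0])
print("recalibrate?", triggered)
```

Requiring a sustained breach rather than a single bad minute is what keeps a monitor like this from paging on transient spikes while still catching genuine drift.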