
Feature Drift Detection with PSI and Distribution Metrics

Feature drift occurs when the statistical distribution of input features shifts between training time and inference time, or gradually over the lifetime of a deployed model. Unlike label drift, which tracks changes in outcomes, feature drift focuses on inputs and can silently degrade model performance even when aggregate prediction accuracy metrics look stable.

The Population Stability Index (PSI) is the workhorse metric for drift detection, quantifying distribution divergence between a reference window (typically training data or a recent baseline) and a comparison window (current production traffic). PSI bins feature values (10 bins for continuous features, the actual categories for categorical features), computes the proportion of samples in each bin for the reference and comparison distributions, then sums the weighted log ratio: PSI = Σᵢ (cᵢ − rᵢ) × ln(cᵢ / rᵢ), where rᵢ and cᵢ are the fractions of reference and comparison samples falling in bin i. The result is a single scalar: PSI below 0.1 indicates no significant change, 0.1 to 0.2 signals moderate drift warranting investigation, and above 0.2 indicates severe drift requiring action such as retraining or a feature engineering review. Netflix recommendation models monitor PSI on 30 to 50 key features with alerts at 0.1 (warn) and 0.2 (critical), recomputing every hour on the latest 24-hour window against a 7-day training baseline.

PSI has limitations that teams supplement with additional metrics. First, it treats all bins equally regardless of their importance to the model: drift in rarely used tail values contributes as much as drift in the dense middle of the distribution, even though predictions may only be sensitive to the middle. Second, PSI requires binning choices that affect sensitivity: too few bins miss granular shifts, too many create noise. Third, it is univariate and does not catch multivariate drift where individual features look stable but the joint distribution shifts. To address these gaps, production systems add Kullback-Leibler (KL) divergence for asymmetric comparison when the reference is ground truth, Jensen-Shannon (JS) divergence for symmetric comparison, Wasserstein distance for distributions where ordering matters, and correlation matrix drift to catch joint distribution changes.

Implementation at scale requires efficiency tricks. Computing exact histograms over billions of events is expensive, so systems use streaming sketches: t-digest for quantile-based binning (deriving 10 bins from approximate percentiles with roughly 1 kilobyte of state per feature), count-min sketch for categorical frequency estimation with sublinear memory, and reservoir sampling to maintain a representative sample of 10,000 to 100,000 examples for detailed analysis. The Airbnb pricing model monitors 80 features at 15-minute granularity across 500,000 pricing requests per hour, computing PSI from t-digest sketches with 3 milliseconds of per-feature overhead. When PSI exceeds 0.15 on any feature, the system triggers detailed analysis on the sampled reservoir to identify which bins shifted and estimates model impact by re-scoring the sample with the current model.
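A minimal sketch of the PSI computation described above, using 10 quantile-based bins derived from the reference window. The function name, the epsilon floor for empty bins, and the toy mean-shift data are illustrative choices rather than a fixed standard:

```python
import numpy as np

def compute_psi(reference, comparison, n_bins=10, eps=1e-4):
    """Population Stability Index between a reference and a comparison sample.

    Bin edges come from reference quantiles so each bin holds roughly an
    equal share of the reference data; `eps` floors empty bins so the log
    ratio stays defined.
    """
    # Inner edges at the 10%, 20%, ..., 90% reference quantiles -> 10 bins.
    inner_edges = np.percentile(reference, np.linspace(0, 100, n_bins + 1)[1:-1])

    # searchsorted maps each value to a bin index 0..n_bins-1, so production
    # values outside the training range land in the first or last bin.
    ref_counts = np.bincount(np.searchsorted(inner_edges, reference), minlength=n_bins)
    cmp_counts = np.bincount(np.searchsorted(inner_edges, comparison), minlength=n_bins)

    ref_pct = np.maximum(ref_counts / ref_counts.sum(), eps)
    cmp_pct = np.maximum(cmp_counts / cmp_counts.sum(), eps)
    return float(np.sum((cmp_pct - ref_pct) * np.log(cmp_pct / ref_pct)))

# Toy check: a mean shift from 0.3 to 0.5, mapped to the alerting bands above.
rng = np.random.default_rng(0)
reference = rng.normal(0.3, 0.1, 100_000)   # stand-in for the training window
comparison = rng.normal(0.5, 0.1, 50_000)   # stand-in for current production traffic
psi = compute_psi(reference, comparison)
level = "stable" if psi < 0.1 else "moderate drift" if psi < 0.2 else "severe drift"
print(f"PSI={psi:.3f} ({level})")
```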
💡 Key Takeaways
Population Stability Index (PSI) quantifies drift via binned distribution comparison: PSI less than 0.1 indicates stability, 0.1 to 0.2 signals moderate drift, greater than 0.2 requires action like retraining
Netflix monitors PSI on 30 to 50 features hourly comparing latest 24 hour window to 7 day training baseline with alert thresholds at 0.1 warn and 0.2 critical for operational response
PSI limitations include equal weighting of all bins regardless of model sensitivity, sensitivity to binning choices (too few miss shifts, too many add noise), and blindness to multivariate joint distribution changes
Supplement PSI with Kullback-Leibler divergence for asymmetric ground-truth comparison, Jensen-Shannon for symmetric cases, Wasserstein for ordered distributions, and correlation matrices for joint drift (see the divergence sketch after this list)
Streaming implementation uses t-digest for approximate quantile binning at roughly 1 kilobyte per feature, count-min sketch for categorical frequencies, and reservoir sampling of 10,000 to 100,000 examples for deep-dive analysis (see the reservoir-sampling sketch after this list)
Airbnb monitors 80 features at 15-minute granularity on 500,000 requests per hour using t-digest with 3 milliseconds of per-feature overhead, triggering sampled re-scoring when PSI exceeds 0.15 to estimate model impact
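A rough sketch of how the supplementary metrics might sit alongside PSI, assuming scipy is available. The binned_pcts helper mirrors the quantile-binning convention from the PSI sketch above, and the correlation-drift summary (max absolute change in pairwise correlation) is one illustrative convention, not part of any named system:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy, wasserstein_distance

def binned_pcts(reference, comparison, n_bins=10, eps=1e-4):
    """Shared reference-quantile bins, returned as proportion vectors."""
    edges = np.percentile(reference, np.linspace(0, 100, n_bins + 1)[1:-1])
    ref_counts = np.bincount(np.searchsorted(edges, reference), minlength=n_bins)
    cur_counts = np.bincount(np.searchsorted(edges, comparison), minlength=n_bins)
    return (np.maximum(ref_counts / ref_counts.sum(), eps),
            np.maximum(cur_counts / cur_counts.sum(), eps))

rng = np.random.default_rng(1)
ref = rng.normal(0.3, 0.10, 50_000)   # reference window
cur = rng.normal(0.4, 0.15, 50_000)   # current production window

ref_pct, cur_pct = binned_pcts(ref, cur)
kl = entropy(cur_pct, ref_pct)             # KL(current || reference): asymmetric, reference as ground truth
js = jensenshannon(ref_pct, cur_pct) ** 2  # symmetric; scipy returns the distance (sqrt), so square it
wass = wasserstein_distance(ref, cur)      # respects ordering and magnitude of the shift

# Joint drift: compare pairwise feature correlations between the two windows.
ref_feats = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], 20_000)
cur_feats = rng.multivariate_normal([0, 0], [[1.0, 0.2], [0.2, 1.0]], 20_000)
corr_drift = np.max(np.abs(np.corrcoef(ref_feats.T) - np.corrcoef(cur_feats.T)))

print(f"KL={kl:.3f}  JS={js:.3f}  Wasserstein={wass:.3f}  max corr shift={corr_drift:.2f}")
```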
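And a minimal sketch of the reservoir sampling mentioned above (Vitter's Algorithm R), which keeps a fixed-size, uniformly representative sample of the stream for later deep-dive analysis. The class name, capacity, and placeholder stream are illustrative:

```python
import random

class Reservoir:
    """Uniform fixed-size sample of an unbounded stream (Algorithm R)."""

    def __init__(self, capacity=10_000, seed=0):
        self.capacity = capacity
        self.sample = []
        self.seen = 0
        self._rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(item)
        else:
            # Replace an existing element with probability capacity / seen,
            # which keeps every stream element equally likely to be retained.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = item

# Feed feature values as they arrive; when PSI crosses the alert threshold,
# re-score `reservoir.sample` with the current model to estimate impact.
stream = random.Random(1)
reservoir = Reservoir(capacity=10_000)
for _ in range(1_000_000):
    reservoir.add(stream.gauss(0.3, 0.1))  # stand-in for a streamed feature value
print(f"kept {len(reservoir.sample)} of {reservoir.seen} events")
```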
📌 Examples
Meta News Feed ranking: monitors PSI on engagement_score, time_since_post, author_follower_count every 15 minutes; when PSI on engagement_score exceeded 0.22 during a viral event, reservoir-sample analysis showed a shift in the feature's mean from 0.3 to 0.5, impacting 12 percent of predictions by more than 20 percent
Uber ETA prediction: tracks PSI on traffic_density, historical_speed, time_of_day using 10 bins per feature; a PSI of 0.18 on traffic_density during a holiday weekend triggered an investigation revealing that the model, trained pre-pandemic, underestimated new traffic patterns
Airbnb pricing: computes PSI on occupancy_rate, neighborhood_demand, seasonality_index every hour; supplements with Wasserstein distance on price_per_night distribution (ordering matters) and correlation matrix on (occupancy_rate, neighborhood_demand) pair to catch joint shifts