
Feature Drift Detection with PSI and Distribution Metrics

PSI FOR FEATURE DRIFT

Population Stability Index (PSI) compares feature distributions between baseline and current data. It quantifies how much distributions have shifted.

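Calculation: bin the feature values, compute the fraction of records in each bin for baseline and current, then sum (current% - baseline%) × ln(current% / baseline%) across bins. An empty bin makes the logarithm undefined, so implementations typically floor each fraction at a small constant.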
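The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `psi`, the equal-width bins taken from the baseline range, and the `eps` floor for empty bins are all assumptions made for the example.

```python
import math
from bisect import bisect_right


def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index with equal-width bins built from the baseline.

    n_bins and eps are illustrative defaults (assumptions), not fixed conventions.
    """
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline has no spread; cannot build bins")
    width = (hi - lo) / n_bins
    # Interior edges only: values outside the baseline range fall into
    # the first or last bin instead of being dropped.
    edges = [lo + i * width for i in range(1, n_bins)]

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor each fraction at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    b = bin_fractions(baseline)
    c = bin_fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


baseline = [i / 100 for i in range(1000)]   # synthetic data for illustration
shifted = [v + 3 for v in baseline]         # same shape, moved up by 3
print(psi(baseline, baseline), psi(baseline, shifted))
```

The bin edges come from the baseline only, so the same edges can be reused for every later comparison window. Quantile bins derived from the baseline are a common alternative when a feature is heavily skewed.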

Interpretation: PSI < 0.1 means negligible shift. PSI 0.1-0.25 means moderate shift worth investigating. PSI > 0.25 means significant shift requiring action.
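These bands translate directly into an alerting rule. The helper below is a hypothetical sketch: the name `classify_psi` and the choice to treat a value of exactly 0.25 as moderate are assumptions, since the thresholds are conventions rather than hard limits.

```python
def classify_psi(value: float) -> str:
    """Map a PSI value to the conventional bands quoted above."""
    if value < 0.1:
        return "negligible"
    if value <= 0.25:
        # Boundary handling at exactly 0.25 is an assumption.
        return "moderate"
    return "significant"


for value in (0.03, 0.18, 0.40):
    print(value, classify_psi(value))
```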

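PSI is symmetric: with the bins held fixed, swapping baseline and current gives the same value. It applies to numeric features once they are binned and to categorical features, where each category acts as a bin. It is among the most widely used metrics for production drift detection because it is interpretable and comparable across features.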

OTHER DISTRIBUTION METRICS

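Wasserstein distance: Also called Earth Mover's Distance. Measures the minimum cost to transform one distribution into another. More sensitive to shape differences than PSI. No binning required. The result is expressed in the feature's own units, so values are not directly comparable across features unless the features are normalized first.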
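For one-dimensional samples, the Wasserstein distance equals the area between the two empirical cumulative distribution functions, which makes it easy to sketch without binning. The function name `wasserstein_1d` is an assumption for this example; in practice a library routine such as SciPy's `scipy.stats.wasserstein_distance` would normally be used instead.

```python
from bisect import bisect_right


def wasserstein_1d(a, b):
    """1-D Wasserstein distance as the area between two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    # Between consecutive observed values both CDFs are flat, so the area
    # is the CDF gap times the width of the interval.
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


# Shifting every value by 1 moves the whole distribution a distance of 1.
print(wasserstein_1d([0, 1, 2], [1, 2, 3]))
```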

KL divergence: Measures information lost when one distribution approximates another. Asymmetric (order matters). Infinite, and so unusable as a metric, when the approximating distribution assigns zero probability to a bin where the other has mass. Use Jensen-Shannon divergence for a symmetric, bounded version.
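The asymmetry and the zero-probability problem are easiest to see on binned fractions. The sketch below assumes both inputs are lists of bin probabilities that each sum to 1; the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two lists of bin probabilities."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # a bin with no mass in p contributes nothing
        if qi == 0:
            return math.inf   # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]
print(kl_divergence(p, q))   # finite
print(kl_divergence(q, p))   # infinite: q has mass in a bin that is empty in p
print(js_divergence(p, q), js_divergence(q, p))  # identical in both directions
```

Because the midpoint distribution is nonzero wherever either input is, Jensen-Shannon stays finite, with a maximum of ln 2 in nats. On the same bin fractions, `kl_divergence(p, q) + kl_divergence(q, p)` equals the PSI sum, which is why PSI is symmetric.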

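K-S statistic: Maximum vertical distance between the two empirical cumulative distributions. Sensitive to shifts in location, spread, and shape, though less so in the tails. The two-sample K-S test pairs the statistic with a p-value for significance testing; with very large samples, even trivial shifts become statistically significant.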
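The statistic itself is simple to compute from raw samples. The sketch below covers only the statistic, under the assumption that the p-value comes from a library: SciPy's `scipy.stats.ks_2samp`, for example, returns both. The function name `ks_statistic` is illustrative.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    # The gap can only change at observed values, so checking those is enough.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


print(ks_statistic([0, 1, 2], [1, 2, 3]))   # partial overlap
print(ks_statistic([0, 1], [5, 6]))         # no overlap: statistic is 1.0
```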

CHOOSING METRICS

Use PSI for business reporting and cross-feature comparison, K-S when you need statistical significance, Wasserstein when shape changes matter (e.g., bimodal to unimodal shifts), and Jensen-Shannon for embedding comparisons.

Track multiple metrics. Each captures different aspects of drift. PSI might miss certain shape changes that Wasserstein catches. Using multiple metrics provides defense in depth.
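One concrete blind spot is movement inside a bin: if values shift without crossing a bin edge, the binned fractions do not change and PSI reports nothing. The sketch below uses hypothetical bin edges and made-up data to show this. It relies on the fact that, for equal-size samples, the 1-D Wasserstein distance is the mean gap between sorted values.

```python
import math
from bisect import bisect_right


def binned_fractions(values, edges):
    """Fraction of values in each bin defined by the interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


def psi_from_fractions(b, c, eps=1e-6):
    """PSI on precomputed bin fractions; eps floors empty bins (assumption)."""
    b = [max(x, eps) for x in b]
    c = [max(x, eps) for x in c]
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


def wasserstein_equal_size(a, b):
    """1-D Wasserstein distance for samples of equal size."""
    if len(a) != len(b):
        raise ValueError("samples must have the same size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


edges = [10, 20, 30]                          # hypothetical edges, 4 bins
baseline = [1, 2, 11, 12, 21, 22, 31, 32]
current = [8, 9, 18, 19, 28, 29, 38, 39]      # each value moved up within its bin

psi_value = psi_from_fractions(
    binned_fractions(baseline, edges), binned_fractions(current, edges)
)
print(psi_value)                                  # 0.0: bin fractions unchanged
print(wasserstein_equal_size(baseline, current))  # 7.0: every value moved by 7
```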

⚠️ Key Trade-off: Simple metrics (PSI) are easier to operationalize but miss subtle shifts. Complex metrics catch more but are harder to interpret and threshold.
💡 Key Takeaways
PSI: <0.1 negligible, 0.1-0.25 moderate, >0.25 significant; symmetric, interpretable, most widely used
Alternatives: Wasserstein for shape sensitivity, K-S for significance testing, Jensen-Shannon for embeddings
Use multiple metrics for defense in depth—each captures different aspects of distribution shift
📌 Interview Tips
1. Interview Tip: Walk through the PSI calculation and explain why the thresholds (0.1, 0.25) are industry conventions.
2. Interview Tip: Explain when Wasserstein is better than PSI: detecting shape changes such as bimodal to unimodal.