Feature Drift Detection with PSI and Distribution Metrics
PSI FOR FEATURE DRIFT
Population Stability Index (PSI) compares feature distributions between baseline and current data. It quantifies how much distributions have shifted.
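Calculation: bin the feature values using bin edges fixed on the baseline, compute the fraction of records in each bin for baseline and current, then sum over bins: PSI = Σ (current% - baseline%) × ln(current% / baseline%). Each term is non-negative, so PSI is 0 only when the binned distributions match. An empty bin makes the log term undefined, so in practice a small floor is applied to each fraction.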
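The calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name psi, the equal-width bins derived from the baseline range, the default of 10 bins, and the eps floor for empty bins are all choices made for this example.

```python
import math


def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    # Bin edges are fixed on the baseline range (equal-width bins: an assumption).
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # The eps floor keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    b = bin_fractions(baseline)
    c = bin_fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


# Illustrative data: a uniform baseline and a copy shifted to the right.
baseline = [i / 100 for i in range(1000)]
shifted = [v + 3 for v in baseline]
print(psi(baseline, baseline))  # identical samples give 0.0
print(psi(baseline, shifted))   # a large shift gives a large PSI
```

Quantile-based bins (deciles of the baseline) are a common alternative to equal-width bins. Whichever scheme is used, keep the same edges for every comparison so that PSI values stay comparable over time.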
Interpretation (a common rule of thumb, not a statistical test): PSI < 0.1 means negligible shift. PSI 0.1-0.25 means moderate shift worth investigating. PSI > 0.25 means significant shift requiring action.
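These thresholds map directly onto a small helper for alerting or reporting. The function name psi_severity and the label strings are hypothetical. Assigning a value of exactly 0.25 to the moderate band is also an assumption, since the thresholds leave that boundary open.

```python
def psi_severity(psi_value):
    """Map a PSI value to the rule-of-thumb severity bands."""
    if psi_value < 0.1:
        return "negligible"
    if psi_value <= 0.25:  # boundary handling at exactly 0.25 is an assumption
        return "moderate"
    return "significant"


for value in (0.03, 0.18, 0.40):
    print(value, psi_severity(value))
```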
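PSI is symmetric: swapping baseline and current gives the same value. It works for categorical features directly and for numeric features once they are binned, although the result depends on the binning scheme. It is widely used for production drift detection because it is interpretable and comparable across features.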
OTHER DISTRIBUTION METRICS
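Wasserstein distance: Also called Earth Mover's Distance. Measures the minimum cost to transform one distribution into another, where cost is probability mass moved times distance moved. More sensitive to shape differences than PSI. No binning required. The result is in the units of the feature, so scale features first if values need to be compared across features.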
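For one-dimensional samples, the 1-Wasserstein distance equals the area between the two empirical CDFs, which makes it straightforward to compute without binning. The sketch below uses only the standard library, and the function name wasserstein_1d is hypothetical. In practice, scipy.stats.wasserstein_distance computes the same quantity.

```python
import bisect


def wasserstein_1d(u, v):
    """1-Wasserstein distance between two empirical one-dimensional samples."""
    u_sorted, v_sorted = sorted(u), sorted(v)
    # Every point where either empirical CDF can jump.
    points = sorted(u_sorted + v_sorted)
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Both empirical CDFs are constant on the interval [left, right).
        cdf_u = bisect.bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect.bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


# Illustrative data: moving every value by 5 units costs exactly 5 per unit mass.
print(wasserstein_1d([0, 1, 2], [5, 6, 7]))
```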
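KL divergence: Measures the information lost when a distribution Q is used to approximate a distribution P. Asymmetric: KL(P || Q) generally differs from KL(Q || P). It is infinite when Q assigns zero probability to a bin where P has mass, so empty bins in the approximating distribution must be smoothed. Use Jensen-Shannon divergence for a symmetric version that is always finite.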
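The asymmetry and the zero-probability problem are easy to see on small discrete distributions. The sketch below assumes both inputs are already binned into probability vectors over the same bins. The function names kl_divergence and js_divergence are hypothetical, and results are in nats (natural log).

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same bins."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # a bin with no mass in P contributes 0 by convention
        if qi == 0:
            return math.inf  # P has mass where Q has none
        total += pi * math.log(pi / qi)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, finite, at most ln(2) in nats."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture of the two distributions
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q), kl_divergence(q, p))  # two different values
print(js_divergence(p, q), js_divergence(q, p))  # the same value both ways
print(kl_divergence([1.0, 0.0], [0.0, 1.0]))     # inf
```

Jensen-Shannon stays finite because each distribution is compared with the mixture m, and m is nonzero wherever either input has mass.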
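K-S statistic: Maximum absolute difference between the two empirical cumulative distribution functions. It applies to continuous, one-dimensional features and needs no binning. It is sensitive to changes in location, spread, and shape, though less so in the tails. The associated two-sample K-S test returns a p-value for significance testing. With very large samples even tiny shifts become statistically significant, so read the p-value alongside the statistic itself.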
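A standard-library sketch of the two-sample statistic and an approximate p-value is shown below. The function name ks_two_sample is hypothetical. The p-value uses the asymptotic Kolmogorov series, which is an approximation that is least reliable for small samples. scipy.stats.ks_2samp is the usual library routine for production code.

```python
import bisect
import math


def ks_two_sample(a, b):
    """Two-sample K-S statistic D and an asymptotic p-value."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # The largest CDF gap is always attained at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / n
        cdf_b = bisect.bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    # Asymptotic p-value: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2).
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return d, 1.0  # the series converges slowly here and the p-value is ~1
    series = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(max(2 * series, 0.0), 1.0)


# Illustrative data: one identical pair and one fully separated pair.
same = list(range(100))
moved = [x + 100 for x in same]
print(ks_two_sample(same, same))
print(ks_two_sample(same, moved))
```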
CHOOSING METRICS
PSI for business reporting and cross-feature comparison. K-S when you need statistical significance. Wasserstein when shape changes matter (e.g., bimodal to unimodal shifts). Jensen-Shannon for embedding comparisons.
Track multiple metrics. Each captures different aspects of drift. PSI might miss certain shape changes that Wasserstein catches. Using multiple metrics provides defense in depth.