Statistical Metrics for Prediction Drift Detection

DISTRIBUTIONAL DISTANCE METRICS
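PSI (Population Stability Index): The same metric used for data drift, applied to model outputs. Bin the baseline predictions and the current predictions, then compare the two binned distributions. By convention, PSI below 0.1 is treated as stable, 0.1 to 0.25 as a moderate shift, and PSI > 0.25 as significant prediction drift. It works well because it is interpretable and has established thresholds.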
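As a minimal sketch of the PSI calculation described above, the snippet below assumes predictions are scores in [0, 1] and uses ten equal-width bins. The function name `psi`, the bin count, and the `eps` floor for empty bins are illustrative choices, not something the text prescribes.

```python
import math


def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of scores in [0, 1]."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Equal-width bins; clamp so 1.0 (and out-of-range scores) land in an edge bin.
            idx = max(0, min(int(v * n_bins), n_bins - 1))
            counts[idx] += 1
        # Floor each fraction at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    expected = bin_fractions(baseline)
    actual = bin_fractions(current)
    # PSI = sum over bins of (actual - expected) * ln(actual / expected)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


baseline_scores = [i / 100 for i in range(100)]  # roughly uniform scores
shifted_scores = [min(s + 0.3, 1.0) for s in baseline_scores]
print(psi(baseline_scores, baseline_scores))
print(psi(baseline_scores, shifted_scores) > 0.25)
```

The result depends on the binning scheme and on how empty bins are floored, so the same binning should be used for every comparison against a given baseline.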
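Jensen-Shannon Divergence: Symmetric measure of similarity between distributions. Bounded between 0 (identical) and 1 (completely different) when computed with base-2 logarithms. It stays finite even when a bin is empty in one of the distributions, which makes it well suited to sparse tail bins where PSI becomes unstable. Useful when extreme predictions matter.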
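The following is a small sketch of Jensen-Shannon divergence over two already-binned distributions, using base-2 logarithms so the result falls between 0 and 1. The function name `js_divergence` and the example bin counts are hypothetical. SciPy users should note that `scipy.spatial.distance.jensenshannon` returns the square root of this quantity.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two binned distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    p_total, q_total = sum(p), sum(q)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each distribution needs positive total mass")
    # Normalize so raw bin counts are accepted as well as probabilities.
    p = [x / p_total for x in p]
    q = [x / q_total for x in q]
    # Mixture distribution: the midpoint of p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Bins where a is zero contribute nothing; b is positive wherever a is.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


baseline_bins = [50, 30, 15, 5, 0]   # hypothetical bin counts
current_bins = [40, 30, 15, 10, 5]   # more mass in the upper tail
print(js_divergence(baseline_bins, current_bins))
```

Because each distribution is compared against the mixture rather than against the other, the empty last bin in the baseline needs no smoothing.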
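Wasserstein Distance: Measures the minimum cost of transforming one distribution into another (also known as earth mover's distance). Because it accounts for how far probability mass has moved, it captures shape changes that binned PSI might miss. More computationally expensive than PSI, since it works on the sorted raw predictions rather than on bin counts.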
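For one-dimensional predictions, the Wasserstein distance equals the area between the two empirical CDFs. The sketch below computes it that way with the standard library only. The function name `wasserstein_1d` is an assumption for illustration; `scipy.stats.wasserstein_distance` provides an equivalent for 1-D samples.

```python
from bisect import bisect_right


def wasserstein_1d(u, v):
    """First Wasserstein distance between two 1-D samples."""
    if not u or not v:
        raise ValueError("both samples must be non-empty")
    u_sorted, v_sorted = sorted(u), sorted(v)
    # Every point where either empirical CDF can change value.
    points = sorted(set(u_sorted) | set(v_sorted))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Empirical CDFs are constant on [left, right).
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


baseline_preds = [0.10, 0.20, 0.30, 0.40]
current_preds = [0.15, 0.25, 0.35, 0.45]  # every prediction moved up by 0.05
print(wasserstein_1d(baseline_preds, current_preds))
```

The result is expressed in the same units as the predictions, so alert thresholds have to be chosen per model rather than taken from a universal rule of thumb.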
SUMMARY STATISTICS
Simpler than full distributional metrics but less sensitive:
Mean shift: Track the prediction mean over time. Alert when it deviates from the baseline mean by more than a set tolerance. Simple and fast, but misses changes in distribution shape.
Variance change: Track the prediction variance. Narrowing variance may indicate that the model is over-confident. Widening variance may indicate uncertainty.
Percentile monitoring: Track p10, p50, p90 separately. Captures both central tendency and tail behavior. Alert when any percentile shifts beyond threshold.
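The three summary statistics above can be tracked together. This is a rough sketch assuming a simple linear-interpolation percentile. The helper names (`percentile`, `summarize`, `summary_shifts`) and the sample values are made up for illustration.

```python
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def summarize(preds):
    """Mean, variance, and p10/p50/p90 of a batch of predictions."""
    s = sorted(preds)
    return {
        "mean": statistics.fmean(s),
        "variance": statistics.pvariance(s),
        "p10": percentile(s, 10),
        "p50": percentile(s, 50),
        "p90": percentile(s, 90),
    }


def summary_shifts(baseline, current):
    """Difference (current minus baseline) for each summary statistic."""
    base, cur = summarize(baseline), summarize(current)
    return {name: cur[name] - base[name] for name in base}


baseline_preds = [0.1, 0.2, 0.3, 0.4, 0.5]
current_preds = [0.1, 0.2, 0.3, 0.4, 0.9]  # only the top of the range moved
for name, delta in summary_shifts(baseline_preds, current_preds).items():
    print(f"{name}: {delta:+.3f}")
```

In this example the median is unchanged while p90 moves noticeably, which is the kind of tail shift that a mean or median check alone would understate.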
CHOOSING METRICS
Start with PSI for its interpretability. Add percentile monitoring for tail sensitivity. Use JS divergence or Wasserstein only if simpler metrics miss important drift in your domain.
Multi-metric monitoring provides defense in depth. Different metrics catch different types of drift. Use 2-3 complementary metrics rather than relying on one.
When To Use: PSI for standard monitoring and reporting. JS divergence when tails matter (rare events). Wasserstein for shape-sensitive detection. Percentiles for interpretable alerting.
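One way to combine complementary metrics is to give each its own threshold and flag drift when any of them is breached. The sketch below assumes the metric values have already been computed elsewhere. The function name `evaluate_drift` and every threshold other than PSI > 0.25 are placeholders, not recommended values.

```python
def evaluate_drift(metric_values, thresholds):
    """Flag drift if any monitored metric exceeds its threshold."""
    breached = sorted(
        name
        for name, value in metric_values.items()
        if name in thresholds and abs(value) > thresholds[name]
    )
    return {"drifted": bool(breached), "breached": breached}


# Only the PSI threshold comes from the text; the other two are placeholders.
THRESHOLDS = {"psi": 0.25, "p90_shift": 0.05, "js_divergence": 0.10}

report = evaluate_drift(
    {"psi": 0.08, "p90_shift": 0.09, "js_divergence": 0.02},
    THRESHOLDS,
)
print(report)
```

Here PSI stays under its threshold while the p90 shift fires, which illustrates why a single metric can miss drift that another one catches.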

💡 Key Takeaways

✓ PSI: interpretable, established thresholds (>0.25); JS divergence: tail-sensitive, bounded 0-1; Wasserstein: shape-sensitive

✓ Summary stats: mean shift (simple), variance change (confidence signals), percentile monitoring (tail behavior)

✓ Use 2-3 complementary metrics for defense in depth; different metrics catch different drift types

📌 Interview Tips

1. Interview Tip: Compare PSI vs JS divergence: PSI for standard cases, JS when extreme predictions matter.

2. Interview Tip: Explain percentile monitoring: tracking p10/p50/p90 captures both center and tails.
