Model Monitoring & Observability › Feature Importance Tracking (SHAP Drift) · Hard · ~3 min

SHAP Drift Failure Modes and Mitigation Strategies

SAMPLING BIAS

If your sample is not representative, SHAP drift analysis will be misleading. A sample dominated by one user segment may show stable importance while another segment experiences significant drift.

Detection: Compare sample composition to traffic composition. Ensure segments are proportionally represented.

Mitigation: Use stratified sampling. Ensure minimum samples per key segment. Report SHAP drift per segment, not just aggregate.
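The stratified-sampling mitigation can be sketched as below. This is a minimal illustration, not a prescribed implementation: the function name, the `segment` key, and the per-segment floor of 40 rows are all assumptions chosen for the example.

```python
import random
from collections import defaultdict

def stratified_sample(rows, segment_key, n_total, min_per_segment=50, seed=0):
    """Sample rows proportionally to each segment's traffic share, with a
    floor so small segments still have enough rows for per-segment SHAP drift."""
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[row[segment_key]].append(row)
    total = len(rows)
    sample = []
    for seg_rows in by_segment.values():
        # proportional allocation, but never below the per-segment floor
        k = max(min_per_segment, round(n_total * len(seg_rows) / total))
        k = min(k, len(seg_rows))  # cannot sample more rows than exist
        sample.extend(rng.sample(seg_rows, k))
    return sample

# toy traffic: 90% "web", 10% "mobile" (hypothetical segments)
rows = [{"segment": "web"}] * 900 + [{"segment": "mobile"}] * 100
sample = stratified_sample(rows, "segment", n_total=200, min_per_segment=40)
```

Without the floor, the "mobile" segment would get only ~20 rows here; the floor guarantees it 40, so segment-level drift statistics remain usable.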

BASELINE DRIFT

SHAP drift is relative to baseline. If the baseline itself was computed during an anomalous period (holiday spike, bug), drift detection will be systematically wrong.

Detection: Track baseline creation date and conditions. Periodically validate baseline still represents "normal" behavior.

Mitigation: Use multiple baselines: the original training baseline plus a recent rolling baseline. Alert only when drift is detected relative to both; this filters out alerts caused by a bad baseline rather than a real change.
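The dual-baseline idea can be sketched as follows. This is an illustrative sketch, assuming importance is summarized as mean |SHAP| per feature; the function names and the 25% relative-change threshold are invented for the example.

```python
def importance_drift(current, baseline, threshold=0.25):
    """Features whose mean |SHAP| moved more than `threshold` (relative)
    compared to a baseline importance dict."""
    drifted = set()
    for feat, base_imp in baseline.items():
        cur_imp = current.get(feat, 0.0)
        if base_imp > 0 and abs(cur_imp - base_imp) / base_imp > threshold:
            drifted.add(feat)
    return drifted

def confirmed_drift(current, training_baseline, rolling_baseline, threshold=0.25):
    """Alert only on features drifted vs BOTH baselines, filtering out
    cases where one baseline itself was computed in an anomalous period."""
    return (importance_drift(current, training_baseline, threshold)
            & importance_drift(current, rolling_baseline, threshold))

# hypothetical mean |SHAP| per feature
training = {"age": 0.40, "income": 0.30, "tenure": 0.10}
rolling  = {"age": 0.42, "income": 0.28, "tenure": 0.11}
current  = {"age": 0.20, "income": 0.31, "tenure": 0.11}
```

Here "age" has moved well over 25% against both baselines, so it is a confirmed drift; a feature that moved only against the training baseline would be suppressed.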

FALSE POSITIVES FROM NATURAL VARIATION

SHAP values vary naturally across samples. Small samples have high variance. Apparent drift may be random noise, not real change.

Detection: Track confidence intervals. Use statistical tests (t-test on SHAP distributions) rather than just comparing means.

Mitigation: Increase sample size for reliable detection. Set alert thresholds based on historical variance, not fixed values. Require drift to persist across multiple time windows before alerting.
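The statistical-test and persistence mitigations can be combined as in the sketch below. It is a simplified illustration: it uses Welch's t statistic with a hard cutoff of |t| > 3 in place of a full p-value computation, and both the cutoff and the three-window persistence requirement are assumed values, not recommendations from the source.

```python
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples of SHAP values."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    return (ma - mb) / ((va / len(a) + vb / len(b)) ** 0.5)

def persistent_drift(baseline, windows, t_cutoff=3.0, k=3):
    """Alert only if the test fires in k CONSECUTIVE time windows,
    so one noisy window cannot trigger an alert on its own."""
    streak = 0
    for w in windows:
        streak = streak + 1 if abs(welch_t(w, baseline)) > t_cutoff else 0
        if streak >= k:
            return True
    return False

# hypothetical per-window SHAP samples for one feature
base    = [0.10, 0.12, 0.11, 0.13, 0.09, 0.11, 0.10, 0.12]
steady  = [0.11, 0.10, 0.12, 0.13, 0.09, 0.11, 0.12, 0.10]
shifted = [0.30, 0.32, 0.31, 0.29, 0.33, 0.30, 0.31, 0.28]
windows = [steady, shifted, shifted, steady, shifted, shifted, shifted]
```

A single shifted window, or even two in a row, is ignored; only the final run of three consecutive shifted windows would raise an alert.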

CORRELATED FEATURE ISSUES

SHAP distributes importance across correlated features. If two features are highly correlated, importance may shift between them without meaningful change in model behavior.

Mitigation: Group correlated features and track aggregate importance. Focus on groups rather than individual features for drift detection.
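Grouping correlated features might look like the following sketch. The greedy merge rule, the 0.9 correlation threshold, and the toy column names are all assumptions for illustration; in practice grouping could also come from domain knowledge.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def group_correlated(columns, threshold=0.9):
    """Greedily merge features whose |correlation| exceeds the threshold."""
    groups = []
    for name in columns:
        for g in groups:
            if any(abs(pearson(columns[name], columns[m])) > threshold for m in g):
                g.append(name)
                break
        else:
            groups.append([name])
    return groups

def group_importance(shap_cols, groups):
    """Track aggregate mean |SHAP| per GROUP instead of per feature."""
    return {tuple(g): sum(sum(abs(v) for v in shap_cols[f]) / len(shap_cols[f])
                          for f in g)
            for g in groups}

# hypothetical SHAP columns: two redundant height features plus age
cols = {
    "height_cm": [1, 2, 3, 4, 5],
    "height_in": [2, 4, 6, 8, 10],
    "age":       [5, 1, 4, 2, 3],
}
groups = group_correlated(cols)
```

If importance later shifts from `height_cm` to `height_in`, the group's aggregate importance stays flat and no spurious drift alert fires.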

⚠️ Key Trade-off: More sophisticated analysis reduces false positives but increases complexity. Start simple, add sophistication when false positive rate becomes a problem.
💡 Key Takeaways
Sampling bias: unrepresentative samples miss segment-specific drift; use stratified sampling with minimum per segment
Baseline drift: anomalous baseline periods cause systematic errors; use multiple baselines (training + rolling)
Natural variation: small samples have high variance; use statistical tests, require persistence across multiple windows
📌 Interview Tips
1. Explain sampling bias: segment-specific drift can be masked by unrepresentative samples.
2. Describe handling correlated features: group them and track aggregate importance.