Model-Centric vs Data-Centric SHAP Monitoring Patterns
MODEL-CENTRIC MONITORING
Model-centric SHAP monitoring computes SHAP values for production predictions and tracks how the model uses features over time. The model is fixed; you observe how changing inputs affect feature contributions.
Implementation: Sample N predictions per time window (for example, hourly or daily). Compute SHAP values for each sample. Aggregate the mean absolute SHAP value per feature. Compare the result against a baseline aggregated the same way on a reference window to detect drift.
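The aggregation and comparison steps can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the SHAP values for a window have already been computed by whatever explainer you use and are passed in as plain rows of floats. The function names `mean_abs_shap` and `shap_drift`, the relative-change rule, and the 0.25 threshold are all hypothetical choices made for the example.

```python
from typing import Dict, Sequence


def mean_abs_shap(shap_rows: Sequence[Sequence[float]],
                  feature_names: Sequence[str]) -> Dict[str, float]:
    """Aggregate one window of SHAP values into mean |SHAP| per feature."""
    if not shap_rows:
        raise ValueError("need at least one sampled prediction")
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)
    return {name: total / len(shap_rows)
            for name, total in zip(feature_names, totals)}


def shap_drift(baseline: Dict[str, float], current: Dict[str, float],
               rel_threshold: float = 0.25) -> Dict[str, float]:
    """Return the relative change for features that moved past rel_threshold."""
    drifted = {}
    for name, base in baseline.items():
        # Guard against division by zero for features with no baseline impact.
        change = (current.get(name, 0.0) - base) / max(abs(base), 1e-12)
        if abs(change) > rel_threshold:
            drifted[name] = change
    return drifted


# Toy SHAP values for two features; real rows would come from an explainer.
features = ["age", "income"]
baseline = mean_abs_shap([[0.4, -0.1], [-0.6, 0.1]], features)
current = mean_abs_shap([[0.5, -0.3], [-0.5, 0.3]], features)
print(shap_drift(baseline, current))
```

In the toy data only `income` is flagged, because its mean absolute contribution roughly triples while that of `age` is unchanged. A relative threshold is one option among several; an absolute threshold or a statistical test over per-sample SHAP values may suit some systems better.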
Advantages: Direct insight into model behavior. Catches subtle changes in how features interact with the model.
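Disadvantages: Expensive computation. SHAP for 1000 samples can take minutes, depending on the explainer and the model. A shift in SHAP values shows that something changed but does not, by itself, tell you whether the cause is a change in the input data or something else.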
DATA-CENTRIC MONITORING
The data-centric approach monitors feature distributions and infers importance changes from data shifts. If feature A's distribution shifts significantly and feature A has high baseline importance, the overall SHAP distribution has likely shifted as well.
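Implementation: Track feature distribution drift with a statistic such as the population stability index (PSI) or the Kolmogorov-Smirnov (K-S) test. Weight each feature's drift score by its baseline feature importance. High-importance features with high drift indicate likely SHAP drift.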
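A rough sketch of the PSI variant is shown below, using only the standard library. It assumes equal-width bins derived from the baseline sample, clips out-of-range values into the end bins, and floors empty bins at a small epsilon so the logarithm stays defined. The names `psi` and `weighted_drift` are hypothetical, and the importance weights are assumed to be baseline mean absolute SHAP values.

```python
import math
from typing import Dict, List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index of `actual` relative to `expected`."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant baseline: everything lands in the end bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip into end bins
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


def weighted_drift(psi_by_feature: Dict[str, float],
                   importance: Dict[str, float]) -> Dict[str, float]:
    """Scale each feature's PSI by its share of total baseline importance."""
    total = sum(importance.values())
    return {name: score * importance[name] / total
            for name, score in psi_by_feature.items()}


# Toy data: "age" shifts by 0.5 between windows, "income" is unchanged.
reference = [i / 100 for i in range(100)]
scores = {
    "age": psi(reference, [v + 0.5 for v in reference]),
    "income": psi(reference, reference),
}
ranked = weighted_drift(scores, {"age": 0.5, "income": 0.1})
print(sorted(ranked.items(), key=lambda kv: kv[1], reverse=True))
```

The weighted score is only a ranking aid: it pushes drift on influential features to the top of the list, but the cut-off that turns a score into an alert still has to be chosen for each system.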
Advantages: Much cheaper than computing SHAP. Scales to high-volume systems. Good for initial screening.
Disadvantages: Indirect signal. Does not capture feature interactions. May miss cases where a feature's own distribution is stable but the model uses the feature differently because of interaction effects, for instance when another feature it interacts with has shifted.
HYBRID APPROACH
Use data-centric monitoring as a cheap filter. When it detects potential drift, trigger more expensive model-centric SHAP computation to confirm.
Alert flow: data drift detected on high-importance feature → compute SHAP on recent sample → compare to baseline SHAP → alert if confirmed.
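The alert flow can be sketched as a single gate function. This is an illustrative outline under stated assumptions: per-feature PSI scores are computed elsewhere, `baseline_importance` holds baseline mean absolute SHAP per feature, and `compute_window_shap` is a hypothetical callback that runs the expensive SHAP job on a recent sample and returns mean absolute SHAP per feature. All thresholds are placeholder values.

```python
from typing import Callable, Dict


def hybrid_check(feature_psi: Dict[str, float],
                 baseline_importance: Dict[str, float],
                 compute_window_shap: Callable[[], Dict[str, float]],
                 psi_threshold: float = 0.25,
                 importance_threshold: float = 0.1,
                 shap_rel_threshold: float = 0.25) -> Dict[str, float]:
    """Return confirmed drifted features mapped to relative SHAP change."""
    # Step 1: cheap filter, data drift on a high-importance feature.
    suspects = [
        name for name, score in feature_psi.items()
        if score > psi_threshold
        and baseline_importance.get(name, 0.0) >= importance_threshold
    ]
    if not suspects:
        return {}  # nothing suspicious, so the SHAP job is never run

    # Step 2: expensive confirmation, executed only when step 1 fires.
    current = compute_window_shap()

    # Step 3: compare against baseline SHAP and keep confirmed features.
    confirmed = {}
    for name in suspects:
        base = baseline_importance[name]
        change = (current.get(name, 0.0) - base) / max(abs(base), 1e-12)
        if abs(change) > shap_rel_threshold:
            confirmed[name] = change
    return confirmed


# Toy run with a stand-in for the SHAP job.
baseline_shap = {"age": 0.5, "income": 0.1}
alerts = hybrid_check({"age": 0.4, "income": 0.02}, baseline_shap,
                      lambda: {"age": 0.9, "income": 0.1})
print(alerts)
```

This sketch confirms only the features that tripped the data-drift filter. Comparing every feature once the SHAP job has run costs little extra and may surface interaction-driven changes that the filter cannot see.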