
Model-Centric vs. Data-Centric SHAP Monitoring Patterns

There are two dominant architectural patterns for SHAP drift monitoring, each optimized for different constraints. Model-centric monitoring computes SHAP values directly on your production model, tracking statistics such as mean absolute SHAP per feature, distributional divergence of SHAP values, or rank-order changes in the top features. This tells you exactly how your production model's reliance on each feature is shifting. Data-centric monitoring takes a different approach: train a lightweight domain classifier to separate baseline-period samples (labeled 0) from recent-period samples (labeled 1), then explain that classifier with SHAP. Features with large SHAP values in the domain classifier are the ones that changed most between the two periods.

The data-centric pattern shines when your production model is expensive to explain. If you're running a deep neural network for recommendations that takes 500 milliseconds to compute SHAP per sample, but you need to monitor 50,000 predictions per second, explaining the production model is prohibitive. Instead, train a small gradient-boosted tree (100 trees, depth 6) on a 24-hour baseline versus the last 60 minutes. TreeSHAP on this lightweight classifier completes in under 20 seconds for 20,000 samples on an 8 vCPU worker. The domain classifier's AUC tells you how much drift exists overall (AUC near 0.5 means no drift; near 1.0 means severe drift), and SHAP on the classifier ranks which features drove the separation.

The tradeoff is subtle but important. The domain classifier can flag features that changed dramatically in distribution but that your production model is invariant to. For example, if a timestamp feature appears only in recent data, the domain classifier locks onto it with high SHAP, even though your production model never used it; model-centric SHAP wouldn't flag it. Many production teams run both patterns and intersect the results: features that appear in both the model-centric top SHAP changes and the domain classifier's top SHAP are the highest-priority signals.
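As a minimal sketch of the data-centric pattern, assuming two pandas DataFrames with identical feature columns (`baseline_df` for the 24-hour window, `recent_df` for the last 60 minutes); the function name, split ratio, and seed are illustrative, not prescribed by the text:

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def domain_classifier_drift(baseline_df: pd.DataFrame, recent_df: pd.DataFrame):
    # Label baseline rows 0 and recent rows 1, then train a classifier
    # to separate the two periods.
    X = pd.concat([baseline_df, recent_df], ignore_index=True)
    y = np.concatenate([np.zeros(len(baseline_df)), np.ones(len(recent_df))])

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0
    )

    # Lightweight gradient-boosted trees (100 trees, depth 6, as in the text).
    clf = GradientBoostingClassifier(n_estimators=100, max_depth=6)
    clf.fit(X_train, y_train)

    # Held-out AUC quantifies overall drift: ~0.5 = no drift, ~1.0 = severe.
    auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

    # TreeSHAP on the lightweight classifier ranks which features drove
    # the separation between the two windows.
    explainer = shap.TreeExplainer(clf)
    shap_values = explainer.shap_values(X_test)
    mean_abs_shap = np.abs(shap_values).mean(axis=0)
    ranking = sorted(zip(X.columns, mean_abs_shap), key=lambda t: -t[1])
    return auc, ranking
```

The returned AUC drives the severity thresholds discussed below, while the SHAP ranking tells you which features to triage first.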
💡 Key Takeaways
Model-centric monitoring computes SHAP on the production model to track reliance shifts; data-centric monitoring trains a domain classifier (baseline vs. recent) and explains it to isolate distribution changes
The domain classifier approach costs 10x to 25x less compute when the production model is expensive to explain (deep networks taking 500 ms per sample vs. lightweight trees at 20 ms)
Domain classifier AUC quantifies drift severity: 0.5 means no drift, 0.6 triggers investigation, and above 0.8 indicates a severe shift requiring immediate action
Data-centric monitoring can flag features that changed distribution but that the production model ignores (proxy features, leakage), creating false positives without model context
Production teams at Meta and Netflix run both patterns and intersect results: features appearing in both lists are the highest-priority triage candidates (see the sketch after these takeaways)
A domain classifier trained on a 24-hour baseline versus a 60-minute recent window with gradient-boosted trees achieves actionable AUC and SHAP rankings in under 20 seconds on an 8 vCPU worker
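As a hedged sketch of that intersection step (the helper name and arguments are hypothetical; the rankings are lists of `(feature, score)` pairs sorted by descending importance, as produced by either pattern):

```python
def high_priority_features(model_centric_ranking, domain_ranking, k=10):
    # Take the top-k features from each pattern's SHAP ranking.
    top_model = {f for f, _ in model_centric_ranking[:k]}
    top_domain = {f for f, _ in domain_ranking[:k]}
    # Features flagged by BOTH the production model's SHAP shift and the
    # domain classifier's SHAP are the highest-priority triage candidates.
    return top_model & top_domain
```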
📌 Examples
A consumer marketplace with an 18,000 QPS deep model uses the domain classifier pattern: a gradient-boosted tree on a 24h baseline vs. a 60 min recent window, paging when AUC exceeds 0.6 for two consecutive 15-minute windows
A high-throughput ranking service with a tree model (600 trees, depth 8) uses the model-centric pattern: it computes TreeSHAP on a 0.1% sample (30 to 80 QPS), micro-batches 3,000 samples per minute, and alerts on a 25% relative change in mean absolute SHAP (sketched after these examples)
Netflix's recommendation system runs both patterns: the domain classifier flags a "time of day" shift, but model-centric SHAP shows the production model doesn't rely on it, so the team dismisses the alert
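As a minimal sketch of the model-centric alert from the second example, assuming a fitted tree model, a stored per-feature baseline mean-|SHAP| vector, and a sampled micro-batch; the function name and threshold default are illustrative:

```python
import numpy as np
import shap

def check_shap_drift(model, batch_df, baseline_mean_abs, rel_threshold=0.25):
    # Assumes a binary or regression tree model, so shap_values returns a
    # single (n_samples, n_features) array.
    explainer = shap.TreeExplainer(model)
    current = np.abs(explainer.shap_values(batch_df)).mean(axis=0)

    # Relative change in mean |SHAP| per feature vs. the baseline window;
    # the small epsilon guards against division by zero.
    rel_change = np.abs(current - baseline_mean_abs) / (baseline_mean_abs + 1e-12)
    drifted = {
        feat: float(chg)
        for feat, chg in zip(batch_df.columns, rel_change)
        if chg > rel_threshold
    }
    return drifted  # a non-empty dict would raise an alert
```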