
How to Build a Production Metric Suite for Forecast Evaluation

Core Metric Suite

No single metric captures every aspect of forecast quality. Production systems need several: MAPE or sMAPE for percentage accuracy, RMSE for absolute accuracy with a heavier penalty on large errors, bias for systematic over- or under-forecasting, and MASE for comparison against a naive baseline. Report all four; optimize for the one most aligned with business cost.
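As a minimal sketch of the four-metric report, the function below computes MAPE, sMAPE, RMSE, and bias in plain Python (MASE needs a naive reference and is left out here). The name `forecast_metrics` is hypothetical, and several choices are assumptions rather than anything the text prescribes: error is defined as forecast minus actual (so positive bias means over-forecasting), points with a zero actual are skipped for MAPE, and sMAPE uses the variant with the mean of |actual| and |forecast| in the denominator.

```python
import math


def forecast_metrics(actual, forecast):
    """Return MAPE, sMAPE, RMSE and bias for two equal-length sequences."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal length")

    # Assumed sign convention: positive error = over-forecast.
    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)

    # MAPE is undefined where actual == 0, so those points are skipped.
    pct = [abs(e) / abs(a) for e, a in zip(errors, actual) if a != 0]
    mape = 100 * sum(pct) / len(pct) if pct else float("nan")

    # sMAPE variant: |error| divided by the mean of |actual| and |forecast|.
    sym = [
        2 * abs(e) / (abs(a) + abs(f))
        for e, a, f in zip(errors, actual, forecast)
        if abs(a) + abs(f) > 0
    ]
    smape = 100 * sum(sym) / len(sym) if sym else float("nan")

    # RMSE squares errors before averaging, so large misses dominate.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Bias is the mean signed error; it reveals systematic direction.
    bias = sum(errors) / n

    return {"mape": mape, "smape": smape, "rmse": rmse, "bias": bias}


if __name__ == "__main__":
    report = forecast_metrics([100, 200, 400], [110, 180, 400])
    for name, value in report.items():
        print(f"{name}: {value:.3f}")
```

Returning all metrics from one call keeps the report consistent: every number is computed over the same points, apart from the zero-actual points that MAPE has to skip.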

MASE (Mean Absolute Scaled Error): MASE divides the model's mean absolute error by that of a naive forecast: MASE = MAE / MAE_naive. MASE < 1 means the model beats naive; MASE > 1 means it loses to naive. MASE is scale-independent and stays defined when actual values are zero (provided the naive forecast's own error is nonzero), which avoids MAPE's main limitations.
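The ratio above can be sketched as follows. This version assumes the common convention of computing MAE_naive in-sample, from one-step (or one-season) naive errors on the training series; the text does not say which data the naive error should come from, so treat that as an assumption. The function name `mase` and the `season` parameter are illustrative.

```python
def mase(train, actual, forecast, season=1):
    """MASE = MAE of the forecast / MAE of a naive forecast on the training data.

    season=1 scales by the plain naive forecast (previous observation);
    season=m scales by the seasonal naive forecast (value m steps back).
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(train) <= season:
        raise ValueError("train must contain more than `season` observations")

    # Out-of-sample mean absolute error of the model being evaluated.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

    # In-sample mean absolute error of the naive forecast (assumed convention).
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    mae_naive = sum(naive_errors) / len(naive_errors)

    if mae_naive == 0:
        # A perfectly flat training series gives a zero denominator.
        raise ValueError("naive error is zero; MASE is undefined")
    return mae / mae_naive


if __name__ == "__main__":
    # The test window contains a zero actual, which MAPE could not handle.
    score = mase(train=[10, 12, 11, 13], actual=[12, 0], forecast=[13, 1])
    print(f"MASE: {score:.2f}")
```

The zero actual in the example causes no problem because the denominator depends on the naive errors, not on individual actual values.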

Segmented Metrics

Aggregate metrics hide problems. A model with 5% MAPE overall may have 3% on high-volume products and 40% on low-volume products. Segment by: volume tier (high/medium/low), product age (established/new), volatility (stable/variable), business importance. Identify segments where forecast quality is unacceptable.
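A rough sketch of segment-level reporting, using only the standard library. The row layout `(segment, actual, forecast)`, the function names, and the 20% threshold in the example are all hypothetical; in practice the segment label would come from whatever volume-tier or product-age mapping the business maintains.

```python
from collections import defaultdict


def mape_by_segment(rows):
    """Compute MAPE (in percent) per segment.

    rows: iterable of (segment, actual, forecast) tuples.
    Rows with a zero actual are skipped because MAPE is undefined there.
    """
    pct_errors = defaultdict(list)
    for segment, actual, forecast in rows:
        if actual != 0:
            pct_errors[segment].append(abs(forecast - actual) / abs(actual))
    return {seg: 100 * sum(vals) / len(vals) for seg, vals in pct_errors.items()}


def flag_segments(segment_mape, threshold_pct):
    """Return the segments whose MAPE exceeds the acceptable threshold."""
    return sorted(seg for seg, value in segment_mape.items() if value > threshold_pct)


if __name__ == "__main__":
    rows = [
        ("high_volume", 1000, 1030),
        ("high_volume", 2000, 1940),
        ("low_volume", 10, 14),
        ("low_volume", 5, 3),
    ]
    by_segment = mape_by_segment(rows)
    for seg, value in sorted(by_segment.items()):
        print(f"{seg}: {value:.1f}% MAPE")
    print("needs attention:", flag_segments(by_segment, threshold_pct=20))
```

Whether the aggregate looks healthy depends on how it is weighted: a volume-weighted average is dominated by the high-volume tier and can look fine while the low-volume tier is far outside an acceptable range.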

Horizon-Specific Metrics

Forecast accuracy degrades with horizon. Report metrics at each horizon: 1-day MAPE, 7-day MAPE, 30-day MAPE. Stakeholders consuming different horizons need appropriate expectations. If 7-day forecasts are used for inventory and 30-day for planning, both horizon metrics matter.
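One way to report accuracy per horizon is to collect forecasts from several forecast origins and average the percentage error at each step ahead. The sketch below assumes each run is a pair `(actuals, forecasts)` where index 0 is the 1-step-ahead value, index 1 the 2-step-ahead value, and so on; that layout and the name `mape_by_horizon` are illustrative assumptions.

```python
def mape_by_horizon(forecast_runs):
    """Compute MAPE (in percent) for each forecast horizon.

    forecast_runs: list of (actuals, forecasts) pairs, one per forecast origin.
    Position h-1 in each sequence holds the value for horizon h.
    """
    sums = {}
    counts = {}
    for actuals, forecasts in forecast_runs:
        for horizon, (a, f) in enumerate(zip(actuals, forecasts), start=1):
            if a == 0:
                continue  # MAPE is undefined for zero actuals
            sums[horizon] = sums.get(horizon, 0.0) + abs(f - a) / abs(a)
            counts[horizon] = counts.get(horizon, 0) + 1
    return {h: 100 * sums[h] / counts[h] for h in sorted(sums)}


if __name__ == "__main__":
    runs = [
        ([100, 100, 100], [101, 105, 110]),  # forecasts made at origin 1
        ([200, 200, 200], [198, 190, 180]),  # forecasts made at origin 2
    ]
    for horizon, value in mape_by_horizon(runs).items():
        print(f"horizon {horizon}: {value:.1f}% MAPE")
```

Averaging across several origins matters here: a single forecast run gives only one error per horizon, which is too noisy to set stakeholder expectations.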

Baseline Comparison

Always compare against baselines: naive (last observation), seasonal naive (same period in the previous season, such as last week or last year), and a simple moving average. If your complex model only marginally beats naive, the complexity may not be justified.
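The three baselines are simple enough to sketch directly. The function names, the `season` and `window` parameters, and the choice of MAE as the comparison metric are assumptions made for illustration; any metric from the core suite could be substituted.

```python
def naive_forecast(history, steps):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * steps


def seasonal_naive_forecast(history, steps, season):
    """Repeat the most recent full season, cycling if steps > season."""
    if len(history) < season:
        raise ValueError("history must cover at least one full season")
    start = len(history) - season
    return [history[start + (i % season)] for i in range(steps)]


def moving_average_forecast(history, steps, window):
    """Repeat the mean of the last `window` observations."""
    recent = history[-window:]
    return [sum(recent) / len(recent)] * steps


def mae(actual, forecast):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def compare_to_baselines(history, actual, model_forecast, season, window):
    """Return MAE for the model and for each baseline over the same horizon."""
    steps = len(actual)
    return {
        "model": mae(actual, model_forecast),
        "naive": mae(actual, naive_forecast(history, steps)),
        "seasonal_naive": mae(actual, seasonal_naive_forecast(history, steps, season)),
        "moving_average": mae(actual, moving_average_forecast(history, steps, window)),
    }


if __name__ == "__main__":
    history = [10, 20, 30, 10, 20, 30]
    scores = compare_to_baselines(
        history, actual=[10, 20, 30], model_forecast=[11, 19, 30], season=3, window=3
    )
    for name, value in scores.items():
        print(f"{name}: MAE {value:.2f}")
```

On a strongly seasonal series like the toy example, seasonal naive can be very hard to beat, which is exactly the situation where a complex model needs to justify itself.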

Business Metrics

Beyond statistical metrics, track business impact: inventory turns, stockout rate, markdown rate, service level achieved. These connect forecast accuracy to business outcomes. A 5% MAPE improvement that reduces stockouts by 20% is more compelling than abstract accuracy gains.

💡 Key Takeaways
Core suite: MAPE/sMAPE (percentage), RMSE (absolute), Bias (direction), MASE (vs naive baseline)
Segment metrics by volume tier, product age, volatility—aggregate metrics hide segment problems
Connect to business metrics: stockout rate, inventory turns, service level achieved
📌 Interview Tips
1. MASE < 1 means the model beats naive; MASE > 1 means it loses. MASE is scale-independent and works with zeros
2. A 5% aggregate MAPE may hide 3% on high-volume and 40% on low-volume products