Failure Modes, Edge Cases, and Operational Challenges
Missing Historical Data
Lag features depend on historical values that may be missing or misleading: new products have no history, stockouts record zeros that do not reflect true demand, and data collection gaps leave holes. Imputation strategies include forward fill (carry the last known value forward), interpolation (estimate between known values), and model-based imputation; alternatively, exclude the affected rows from training.
Warning: Do not treat stockout zeros as true demand. Doing so teaches the model that zero demand is normal when the zeros actually reflect supply constraints. Flag stockouts separately and handle them explicitly.
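As a minimal sketch of the warning above, the snippet below separates stockout zeros from genuine zero-demand days before imputing. It assumes an in-stock indicator is available for each day; the names mask_stockouts, forward_fill, sales, and in_stock are illustrative, not part of any particular library, and forward fill is only one of the imputation options listed earlier.

```python
def mask_stockouts(sales, in_stock):
    """Replace sales on stockout days with None and return a 0/1 stockout flag."""
    demand = [s if ok else None for s, ok in zip(sales, in_stock)]
    flags = [0 if ok else 1 for ok in in_stock]
    return demand, flags


def forward_fill(values):
    """Carry the last known value forward; leading gaps stay None."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical data: days 3 and 4 are stockouts, day 6 is a genuine zero.
sales = [5, 7, 0, 0, 6, 0]
in_stock = [True, True, False, False, True, True]

demand, stockout_flag = mask_stockouts(sales, in_stock)
demand_filled = forward_fill(demand)
```

The flag can be kept as its own feature so the model can tell imputed days from observed ones, while the genuine zero on the last day is left untouched.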
Insufficient History for Long Lags
A series with only 60 days of history cannot supply a lag-365 feature. Cold-start strategies: use category-level statistics, borrow history from similar series, or use only the lags that are available, with regularization to prevent overfitting to the short history. Enable longer lags gradually as history accumulates.
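One way to enable lags gradually is to select, per series, only the lags that leave enough rows to train on, and to fall back to a category-level statistic when a lag is out of reach. The sketch below assumes daily data stored oldest-first; usable_lags, lag_value, the min_targets threshold, and the fallback value are hypothetical choices for illustration.

```python
def usable_lags(history_len, candidate_lags, min_targets=1):
    """Keep lags that leave at least `min_targets` rows with both a target and a lagged value."""
    return [lag for lag in candidate_lags if history_len - lag >= min_targets]


def lag_value(series, lag, fallback):
    """Value `lag` steps before the next step to forecast, or `fallback` if history is too short."""
    return series[-lag] if len(series) >= lag else fallback


candidates = [1, 7, 28, 365]

# A 60-day series: lag-365 is dropped, shorter lags remain.
short_history = usable_lags(60, candidates, min_targets=30)

# After 400 days, lag-365 leaves 35 usable rows and is enabled.
long_history = usable_lags(400, candidates, min_targets=30)

# Fallback to an assumed category-level mean when the lag is unavailable.
category_mean = 12.5
new_product = [10, 14, 11]
lag_7 = lag_value(new_product, 7, fallback=category_mean)
```

Requiring a minimum number of usable rows, rather than merely one, keeps a newly enabled long lag from being fitted on a handful of observations.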
Calendar Feature Misalignment
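Holiday effects vary by region and year. Easter moves from year to year, and US Thanksgiving falls on the fourth Thursday of November, so its date shifts as well; only holidays such as Christmas have a fixed calendar date. Using the wrong holiday calendar for a region causes systematic errors on those days. Maintain region-specific holiday lists. For floating holidays, encode "days until holiday" rather than fixed calendar dates.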
Debugging Tip: When forecasts fail on specific dates, check calendar alignment first. Missing or incorrect holiday flags cause large errors concentrated on predictable dates.
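The "days until holiday" encoding can be sketched with the standard library alone. The example below covers two floating holidays, Western (Gregorian) Easter Sunday and US Thanksgiving; the function names are illustrative, and a production pipeline would more likely read dates from a maintained region-specific calendar than compute them by hand.

```python
from datetime import date, timedelta


def easter_sunday(year):
    """Western Easter Sunday via the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def us_thanksgiving(year):
    """Fourth Thursday of November."""
    nov1 = date(year, 11, 1)
    first_thursday = (3 - nov1.weekday()) % 7  # Monday=0, so Thursday=3
    return nov1 + timedelta(days=first_thursday + 21)


def days_until(day, holiday_fn):
    """Days from `day` to the next occurrence of the holiday (0 on the holiday itself)."""
    nxt = holiday_fn(day.year)
    if nxt < day:
        nxt = holiday_fn(day.year + 1)
    return (nxt - day).days


feature = days_until(date(2024, 11, 21), us_thanksgiving)  # one week before
```

Because the feature is relative to the holiday rather than to the calendar date, the same value lines up across years even though the holiday itself moves.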
Feature Computation Delays
Batch pipelines add latency: features computed overnight may not reflect yesterday evening's activity. For high-frequency forecasts, this staleness degrades accuracy. Monitor feature freshness and alert when computation completes late. Consider streaming computation for features where staleness significantly hurts forecast quality.
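A freshness monitor can be as simple as comparing each feature's last computation time against a per-feature tolerance. The sketch below assumes the pipeline records a UTC timestamp per feature; stale_features, the feature names, and the tolerances are hypothetical, and the alerting step itself is left out.

```python
from datetime import datetime, timedelta, timezone


def stale_features(last_computed, now, max_age):
    """Names of features whose age exceeds their allowed maximum, sorted for stable output."""
    return sorted(
        name
        for name, computed_at in last_computed.items()
        if now - computed_at > max_age[name]
    )


# Hypothetical timestamps recorded by the pipeline.
now = datetime(2024, 5, 2, 9, 0, tzinfo=timezone.utc)
last_computed = {
    "lag_1_sales": datetime(2024, 5, 2, 3, 0, tzinfo=timezone.utc),     # 6 hours old
    "rolling_7d_mean": datetime(2024, 4, 30, 3, 0, tzinfo=timezone.utc),  # 54 hours old
}
max_age = {
    "lag_1_sales": timedelta(hours=4),
    "rolling_7d_mean": timedelta(hours=72),
}

late = stale_features(last_computed, now, max_age)
```

Setting the tolerance per feature reflects the point above: a short lag goes stale within hours, while a slow-moving rolling statistic can tolerate a late batch.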