Production Architecture for Statistical Models
BATCH TRAINING PIPELINE
Flow: ingest data daily → fit one model per series → store parameters and forecasts. Fitting one model takes 10-100ms. For 1M series on 100 workers, that is about 2-17 minutes of pure fitting time, so budget ~20 minutes total at the slow end. The bottleneck is data I/O, not model fitting.
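The throughput estimate above can be checked with a few lines of arithmetic. This is a rough sketch that assumes perfectly even parallelism and ignores I/O and scheduling overhead; the function name `batch_fit_minutes` is hypothetical.

```python
def batch_fit_minutes(n_series: int, fit_ms: float, n_workers: int) -> float:
    """Wall-clock minutes spent on fitting alone, assuming work splits evenly."""
    total_seconds = n_series * fit_ms / 1000.0  # total CPU time across all series
    return total_seconds / n_workers / 60.0     # spread over workers, in minutes


# 1M series on 100 workers, at the fast and slow ends of the 10-100ms range.
fast_minutes = batch_fit_minutes(1_000_000, fit_ms=10, n_workers=100)
slow_minutes = batch_fit_minutes(1_000_000, fit_ms=100, n_workers=100)
print(f"fitting time: {fast_minutes:.1f} to {slow_minutes:.1f} minutes")
```

Since fitting alone stays under 20 minutes even at 100ms per model, any run that takes much longer points at data loading or writes rather than the models.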
MODEL STORAGE
Store parameters, not raw data. ETS needs ~50 bytes/series, so 10M series take ~500MB. Pre-compute forecasts for the next N periods and store them in a key-value store for sub-millisecond serving.
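To make the ~50 bytes/series figure concrete, here is a minimal sketch of packing ETS parameters into a fixed-size binary record. The layout (three smoothing parameters, level, trend, and seven seasonal states as float32) is an assumption for illustration, not a standard format, and the plain dict `param_store` stands in for a real key-value store.

```python
import math
import struct

# Hypothetical layout: alpha, beta, gamma, level, trend, plus 7 seasonal states,
# all little-endian float32 -> 12 values * 4 bytes = 48 bytes per series.
ETS_FORMAT = "<12f"


def pack_ets(alpha, beta, gamma, level, trend, seasonal):
    """Serialize one series' ETS parameters into a fixed-size record."""
    if len(seasonal) != 7:
        raise ValueError("this sketch assumes a weekly season of length 7")
    return struct.pack(ETS_FORMAT, alpha, beta, gamma, level, trend, *seasonal)


def unpack_ets(blob):
    """Inverse of pack_ets; values come back at float32 precision."""
    values = struct.unpack(ETS_FORMAT, blob)
    names = ("alpha", "beta", "gamma", "level", "trend")
    params = dict(zip(names, values[:5]))
    params["seasonal"] = list(values[5:])
    return params


param_store = {}  # stand-in for a key-value store, keyed by series id
blob = pack_ets(0.2, 0.05, 0.1, 120.0, 1.5, [0.9, 1.0, 1.1, 1.2, 1.0, 0.8, 1.0])
param_store["series:42"] = blob

total_bytes = len(blob) * 10_000_000  # 10M series at this record size
print(len(blob), "bytes/series,", total_bytes / 1e6, "MB for 10M series")
```

Using float32 halves the record size compared with float64, at a precision loss that is usually negligible next to forecast error.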
FRESHNESS VS COST
Pre-computed forecasts go stale as new data arrives. Daily retraining is typical; hourly retraining costs 24x as much compute. Some teams retrain only when drift is detected (accuracy drops below a threshold).
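A drift trigger can be sketched as a comparison between recent forecast error and the error measured at training time. The function name `should_retrain`, the use of mean absolute error, and the 1.5x tolerance are all assumptions for illustration; a real system would pick its own metric and threshold.

```python
from statistics import mean
from typing import Sequence


def should_retrain(
    actuals: Sequence[float],
    forecasts: Sequence[float],
    baseline_mae: float,
    tolerance: float = 1.5,
) -> bool:
    """Flag a series for retraining when recent MAE exceeds tolerance * baseline_mae.

    baseline_mae is the error recorded when the model was last fitted.
    """
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need equal-length, non-empty actuals and forecasts")
    recent_mae = mean(abs(a - f) for a, f in zip(actuals, forecasts))
    return recent_mae > tolerance * baseline_mae


# Forecasts still track the data: no retrain needed.
stable = should_retrain([10, 12, 11], [10, 11, 11], baseline_mae=1.0)
# Forecasts are off by 4 on every point: retrain.
drifted = should_retrain([10, 12, 11], [14, 8, 15], baseline_mae=1.0)
print(stable, drifted)
```

The check itself only needs stored forecasts and new actuals, so it is far cheaper to run hourly than a full refit.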
AUTO MODEL SELECTION
Not all series need the same model. Per-series selection: fit ETS, ARIMA, and a simple baseline, then pick the winner by lowest cross-validation error. Unpredictable series fall back to the baseline when the complex models do not beat it.
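The selection loop can be sketched with rolling-origin cross-validation on one-step-ahead forecasts. To keep the example dependency-free, simple exponential smoothing stands in for ETS and a drift forecaster stands in for a more complex model; ARIMA is omitted. All function names here are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

Forecaster = Callable[[Sequence[float]], float]  # history -> one-step-ahead forecast


def naive(history: Sequence[float]) -> float:
    """Baseline: repeat the last observed value."""
    return history[-1]


def ses(history: Sequence[float], alpha: float = 0.3) -> float:
    """Simple exponential smoothing (a stand-in for a full ETS fit)."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def drift(history: Sequence[float]) -> float:
    """Last value plus the average historical step."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[0]) / (len(history) - 1)


def cv_mae(series: Sequence[float], model: Forecaster, min_train: int = 5) -> float:
    """Rolling-origin CV: forecast each point from all points before it."""
    return mean(abs(series[t] - model(series[:t])) for t in range(min_train, len(series)))


def select_model(
    series: Sequence[float],
    candidates: Dict[str, Forecaster],
    baseline: str = "naive",
    min_train: int = 5,
) -> Tuple[str, Dict[str, float]]:
    """Pick the lowest-error model; keep the baseline unless something strictly beats it."""
    scores = {name: cv_mae(series, model, min_train) for name, model in candidates.items()}
    best = min(scores, key=scores.get)
    if scores[best] >= scores[baseline]:
        best = baseline
    return best, scores


CANDIDATES = {"naive": naive, "ses": ses, "drift": drift}

trending = [float(i) for i in range(20)]
winner, scores = select_model(trending, CANDIDATES)
print(winner, scores)
```

Requiring a strict improvement over the baseline keeps flat or noisy series on the cheapest model, which matches the fallback behavior described above.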