Modeling Strategies: Recursive vs Direct Multi-Output vs Per-Horizon Models
Recursive (Iterated) Strategy
Train a single-step model, then iterate: predict t+1, feed that prediction back as an input to predict t+2, and repeat. Advantages: one model handles any horizon and captures autoregressive dependencies. Disadvantages: errors compound, so a small error at t+1 can be amplified considerably by t+30. Works well for short horizons; accuracy typically degrades as the horizon grows.
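The iteration can be sketched in a few lines. This is a minimal illustration, assuming a linear autoregressive model fit by least squares with NumPy; the helper names (fit_one_step, recursive_forecast), the lag count, and the synthetic sine series are assumptions made for the example, not part of the original text.

```python
import numpy as np

def fit_one_step(series, n_lags):
    """Fit a linear one-step-ahead model: last n_lags values -> next value."""
    n = len(series) - n_lags
    X = np.array([series[i:i + n_lags] for i in range(n)])
    X = np.column_stack([X, np.ones(n)])      # intercept column
    y = series[n_lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def recursive_forecast(series, coef, n_lags, horizon):
    """Predict one step, append the prediction to the window, repeat."""
    window = list(series[-n_lags:])
    preds = []
    for _ in range(horizon):
        x = np.append(window[-n_lags:], 1.0)
        y_hat = float(x @ coef)
        preds.append(y_hat)
        window.append(y_hat)                  # the prediction becomes an input
    return np.array(preds)

# Synthetic, noise-free example series (an assumption for illustration).
t = np.arange(200)
series = np.sin(2 * np.pi * t / 20)

coef = fit_one_step(series, n_lags=2)
forecast = recursive_forecast(series, coef, n_lags=2, horizon=10)
```

On real, noisy data each appended prediction carries some error into the next step, which is the compounding effect described above; the noise-free series here hides that on purpose so the mechanics are easy to follow.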
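Error Accumulation: If the single-step error is 5% and errors compound multiplicatively, the cumulative error after 10 recursive steps can exceed 50% (1.05^10 ≈ 1.63, roughly 63%). This is a worst case; actual growth depends on the model and the data. Use scheduled sampling during training, which gradually replaces ground-truth inputs with the model's own predictions, to make the model more robust to its own errors.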
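The arithmetic and the sampling idea can be sketched as follows. This is a hedged illustration only: the linear decay schedule is one common choice among several, and the helper names (teacher_forcing_prob, mix_inputs) and the toy arrays are hypothetical, introduced for this example.

```python
import numpy as np

# Worst-case multiplicative compounding of a 5% per-step error over 10 steps.
per_step_error = 0.05
steps = 10
compounded = (1 + per_step_error) ** steps - 1   # about 0.63

def teacher_forcing_prob(epoch, n_epochs, floor=0.0):
    """Linearly decay the probability of feeding ground truth from 1.0 to floor."""
    frac = epoch / max(n_epochs - 1, 1)
    return max(floor, 1.0 - frac)

def mix_inputs(truth, preds, p_truth, rng):
    """Per time step, keep the true value with probability p_truth,
    otherwise substitute the model's own prediction."""
    use_truth = rng.random(len(truth)) < p_truth
    return np.where(use_truth, truth, preds)

rng = np.random.default_rng(0)
truth = np.array([1.0, 2.0, 3.0, 4.0])   # toy ground-truth inputs
preds = np.array([1.1, 1.9, 3.2, 3.7])   # toy model predictions

for epoch in (0, 5, 9):
    p = teacher_forcing_prob(epoch, n_epochs=10)
    mixed = mix_inputs(truth, preds, p, rng)   # inputs used for this epoch
```

Early epochs train almost entirely on ground truth; later epochs expose the model to inputs that look like what it will see during recursive inference.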
Direct Multi-Output Strategy
Train one model that outputs all horizons simultaneously. The model learns to predict [t+1, t+2, ..., t+H] in a single forward pass. Advantages: no error accumulation, and horizon-specific patterns are captured. Disadvantages: the maximum horizon is fixed at training time, and the model becomes computationally expensive when there are many horizons. Works well when the horizon count is moderate (under 30).
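A single model with H outputs can be sketched like this. It is a minimal example assuming a linear model fit with NumPy least squares; the function names, lag count, horizon, and synthetic series are assumptions chosen for illustration.

```python
import numpy as np

def make_multi_output_windows(series, n_lags, horizon):
    """Each row of X holds n_lags past values; each row of Y the next `horizon` values."""
    n = len(series) - n_lags - horizon + 1
    X = np.array([series[i:i + n_lags] for i in range(n)])
    Y = np.array([series[i + n_lags:i + n_lags + horizon] for i in range(n)])
    return X, Y

def fit_multi_output(series, n_lags, horizon):
    """Fit one weight matrix that maps a lag window to all horizons at once."""
    X, Y = make_multi_output_windows(series, n_lags, horizon)
    X = np.column_stack([X, np.ones(len(X))])       # intercept column
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)       # shape (n_lags + 1, horizon)
    return W

def direct_forecast(series, W, n_lags):
    """One matrix product yields [t+1, ..., t+H]; no prediction is fed back."""
    x = np.append(series[-n_lags:], 1.0)
    return x @ W

# Synthetic, noise-free example series (an assumption for illustration).
t = np.arange(200)
series = np.sin(2 * np.pi * t / 20)

W = fit_multi_output(series, n_lags=2, horizon=5)
forecast = direct_forecast(series, W, n_lags=2)
```

With plain least squares each output column is fitted independently, so this linear sketch shows the interface rather than the benefit; sharing information across horizons only happens in models with shared parameters, such as a neural network with a common hidden representation.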
Per-Horizon Strategy
Train a separate model for each horizon: model_1 predicts t+1, model_7 predicts t+7, model_30 predicts t+30. Advantages: each model is optimized for its specific horizon. Disadvantages: computational cost scales with the number of horizons, and because the models are fit independently, forecasts may be inconsistent across horizons. Use when horizons have very different characteristics.
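The per-horizon setup above might look like the following sketch. It assumes simple linear models fit with NumPy least squares; fit_horizon_model and predict_horizon are hypothetical helper names, and in practice each horizon could use its own features, lag count, or model class.

```python
import numpy as np

def fit_horizon_model(series, n_lags, h):
    """Fit one linear model: last n_lags values -> value h steps ahead."""
    n = len(series) - n_lags - h + 1
    X = np.array([series[i:i + n_lags] for i in range(n)])
    X = np.column_stack([X, np.ones(n)])                 # intercept column
    y = np.array([series[i + n_lags + h - 1] for i in range(n)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_horizon(series, coef):
    """Apply a fitted horizon model to the most recent lag window."""
    n_lags = len(coef) - 1
    x = np.append(series[-n_lags:], 1.0)
    return float(x @ coef)

# Synthetic, noise-free example series (an assumption for illustration).
t = np.arange(200)
series = np.sin(2 * np.pi * t / 20)

# One independent model per horizon of interest, as in model_1, model_7, model_30.
models = {h: fit_horizon_model(series, n_lags=2, h=h) for h in (1, 7, 30)}
forecasts = {h: predict_horizon(series, coef) for h, coef in models.items()}
```

Because nothing ties the three models together, their forecasts are not guaranteed to form a smooth trajectory on noisy data, which is the inconsistency mentioned above.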
Practical Recommendation: Start with direct multi-output for moderate horizons (up to 30 steps). Use recursive for very long horizons where multi-output is impractical, accepting that errors will accumulate. Reserve per-horizon models for specialized applications.
Hybrid Approaches
Combine strategies: use direct for short horizons (high accuracy needed), recursive for long horizons (flexibility needed). Some architectures naturally support both: encoder-decoder models can decode either autoregressively or in parallel.
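One possible way to combine the two is sketched below: a direct multi-output model covers the first few steps, and a one-step model continues recursively from those predictions. This is an illustrative assumption about how the combination could be wired, using linear models and NumPy; the function names and the split point (direct_h) are hypothetical.

```python
import numpy as np

def lagged(series, n_lags, horizon):
    """Return lag windows (with intercept column) and the next `horizon` values."""
    n = len(series) - n_lags - horizon + 1
    X = np.array([series[i:i + n_lags] for i in range(n)])
    Y = np.array([series[i + n_lags:i + n_lags + horizon] for i in range(n)])
    return np.column_stack([X, np.ones(n)]), Y

def hybrid_forecast(series, n_lags, direct_h, total_h):
    """Direct multi-output for steps 1..direct_h, recursive for the remainder."""
    X, Y = lagged(series, n_lags, direct_h)
    W_direct, *_ = np.linalg.lstsq(X, Y, rcond=None)       # all short horizons
    X1, Y1 = lagged(series, n_lags, 1)
    w_step, *_ = np.linalg.lstsq(X1, Y1[:, 0], rcond=None)  # one-step model

    last = np.append(series[-n_lags:], 1.0)
    preds = list(last @ W_direct)                 # short horizons, no feedback
    history = list(series[-n_lags:]) + preds
    for _ in range(total_h - direct_h):           # long horizons, with feedback
        x = np.append(history[-n_lags:], 1.0)
        y_hat = float(x @ w_step)
        preds.append(y_hat)
        history.append(y_hat)
    return np.array(preds)

# Synthetic, noise-free example series (an assumption for illustration).
t = np.arange(200)
series = np.sin(2 * np.pi * t / 20)

forecast = hybrid_forecast(series, n_lags=2, direct_h=5, total_h=12)
```

The short-horizon predictions avoid compounding entirely, while the recursive tail keeps the total horizon flexible at the cost of accumulating error from step direct_h + 1 onward.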