
ARIMA: Autoregression, Differencing, and Moving Averages

AutoRegressive Integrated Moving Average (ARIMA) models forecast future values as a linear combination of lagged observations and lagged errors, after differencing the series to achieve stationarity. The notation ARIMA(p, d, q) specifies three orders: p for autoregressive lags, d for differencing steps, and q for moving average error lags. Seasonal ARIMA (SARIMA) extends this with seasonal terms, ARIMA(p, d, q)(P, D, Q)m, where m is the seasonal period, such as 7 for daily data with weekly patterns or 24 for hourly data with daily cycles.

The modeling workflow identifies appropriate orders through diagnostics. First, apply differencing to remove trend and seasonality: order d removes trend, order D removes seasonal patterns. Then inspect the AutoCorrelation Function (ACF) and Partial AutoCorrelation Function (PACF) of the differenced series to choose p and q. For example, if the PACF cuts off at lag 2 while the ACF decays slowly, try p = 2; if the ACF cuts off at lag 1 while the PACF decays, try q = 1. In practice, retail and traffic forecasting systems typically cap d and D at 1 and limit p and q to 3 or less to avoid overfitting.

ARIMA handles exogenous regressors through ARIMAX formulations, which is valuable for incorporating promotions, pricing, or calendar effects. Coefficients are fit by maximum likelihood estimation. For production deployment, store the coefficients plus the recent observations and residuals needed for recursive forecasting. Unlike ETS, whose state update is constant time, each ARIMA forecast step costs O(p + q) operations, though this remains fast in practice: fitting takes roughly 300 to 500 milliseconds per series for typical retail applications.

The stationarity requirement is both a strength and a weakness. ARIMA explicitly models short-term autocorrelation patterns that ETS might miss, making it powerful for series with strong recent memory. However, choosing the right differencing orders is subtle: over-differencing removes signal and inflates forecast variance, while under-differencing leaves nonstationarity that violates model assumptions. Automated identification using the corrected Akaike Information Criterion (AICc) and rolling-origin validation helps, but requires more tuning than ETS.
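To make the identify-then-fit workflow concrete, here is a minimal sketch using statsmodels' SARIMAX class on a synthetic daily series. The series, the lag counts, and the chosen orders are illustrative assumptions, not prescriptions; in practice the ACF/PACF plots and AICc comparison drive the choice.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic daily sales series with a mild trend and weekly seasonality
rng = np.random.default_rng(0)
idx = pd.date_range("2023-01-01", periods=365, freq="D")
t = np.arange(365)
y = pd.Series(100 + 0.05 * t + 10 * np.sin(2 * np.pi * t / 7)
              + rng.normal(0, 3, 365), index=idx)

# Step 1: difference once for trend (d = 1) and once at lag 7 (D = 1),
# then inspect ACF/PACF of the stationary series to pick p and q
stationary = y.diff().diff(7).dropna()
fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(stationary, lags=28, ax=axes[0])
plot_pacf(stationary, lags=28, ax=axes[1])

# Step 2: fit SARIMA(1,1,1)(1,1,1)7 by maximum likelihood
res = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 7)).fit(disp=False)
print(f"AICc: {res.aicc:.1f}")  # compare candidate orders on AICc

# Step 3: recursive 14-day forecast with prediction intervals
fc = res.get_forecast(steps=14)
print(fc.predicted_mean.head())
print(fc.conf_int().head())
```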
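The deployment point about storing coefficients plus recent state can also be shown directly. The hypothetical helper below (not part of any library) forecasts one step of an ARMA(p, q) on the already-differenced series from nothing but the fitted constant, the p AR and q MA coefficients, the last p observations, and the last q residuals, which is why each step is O(p + q).

```python
import numpy as np

def one_step_arma_forecast(c, phi, theta, recent_y, recent_eps):
    """Hypothetical one-step ARMA(p, q) forecast from stored state.

    c          -- fitted constant
    phi        -- AR coefficients [phi_1, ..., phi_p]
    theta      -- MA coefficients [theta_1, ..., theta_q]
    recent_y   -- last p observations of the differenced series, oldest first
    recent_eps -- last q one-step residuals, oldest first
                  (in production these come from the fitted model's state)

    Cost per call is O(p + q): two short dot products.
    """
    ar_part = np.dot(phi, recent_y[::-1])     # phi_1*y_t + ... + phi_p*y_{t-p+1}
    ma_part = np.dot(theta, recent_eps[::-1]) # theta_1*eps_t + ...
    return c + ar_part + ma_part

# Example: ARMA(2, 1) with illustrative coefficients and state
print(one_step_arma_forecast(
    c=0.2,
    phi=np.array([0.6, -0.2]),
    theta=np.array([0.4]),
    recent_y=np.array([1.1, 1.4]),   # y_{t-1}, y_t
    recent_eps=np.array([0.05]),     # eps_t
))
```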
💡 Key Takeaways
ARIMA(p, d, q) combines p autoregressive lags, d differencing steps, and q moving average error terms to model stationary time series
Seasonal ARIMA adds seasonal period m and terms (P, D, Q): SARIMA(p,d,q)(P,D,Q)m, where m = 7 captures weekly cycles in daily data and m = 24 captures daily cycles in hourly data
Stationarity requirement: series must be differenced to remove trend and seasonality before modeling, typically capping d and D at 1 for production systems
Identification workflow: apply unit root tests and visual diagnostics to choose d and D, then inspect ACF and PACF plots to select p and q, limiting each to 3 or less
ARIMAX extends ARIMA with exogenous regressors like price, promotions, or holidays, enabling causal modeling alongside temporal patterns (see the sketch after this list)
Fit time is 300 to 500 milliseconds per series and each forecast step costs O(p + q) operations: slower than ETS per step, but captures short-memory autocorrelation ETS may miss
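As a sketch of the ARIMAX takeaway above, statsmodels' SARIMAX accepts regressors through its exog argument. The promo and holiday columns here are made-up illustrations, and note that forecasting requires supplying the regressors' future values, such as a planned promotion calendar.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(1)
idx = pd.date_range("2023-01-01", periods=365, freq="D")

# Hypothetical regressors: a random promotion flag and a Sunday dummy
exog = pd.DataFrame({"promo": rng.integers(0, 2, 365),
                     "holiday": (idx.dayofweek == 6).astype(int)}, index=idx)

# Simulated demand with a real +5 promo effect so the fit has signal to find
t = np.arange(365)
y = pd.Series(100 + 5 * exog["promo"] + 10 * np.sin(2 * np.pi * t / 7)
              + rng.normal(0, 3, 365), index=idx)

res = SARIMAX(y, exog=exog, order=(1, 1, 1),
              seasonal_order=(1, 1, 1, 7)).fit(disp=False)
print(res.params[["promo", "holiday"]])  # estimated regressor effects

# Forecasting needs future regressor values (e.g., the promo calendar)
future_idx = pd.date_range(idx[-1] + pd.Timedelta(days=1), periods=14, freq="D")
future_exog = pd.DataFrame({"promo": np.ones(14, dtype=int),
                            "holiday": (future_idx.dayofweek == 6).astype(int)},
                           index=future_idx)
fc = res.get_forecast(steps=14, exog=future_exog)
print(fc.predicted_mean)
```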
📌 Examples
Retail demand forecasting: SARIMA(1,1,1)(1,1,1)7 for daily item sales with weekly seasonality, differencing once for trend and once at lag 7 for weekly cycles
Airbnb search traffic: ARIMAX with promotion and holiday regressors to capture marketing spend effects and special event spikes
Lyft zone demand: SARIMA per zone with p and q capped at 2, updating coefficients in batch runs every 30 minutes to feed driver allocation