Interpreting Time Series Forecasts with LIME and SHAP: A Case Study on the Air Passengers Dataset
This work addresses interpretability in time-series forecasting for domains such as aviation and retail. Its contribution is incremental: it adapts existing explainability methods to a specific forecasting context.
The paper tackles the problem of interpreting time-series forecasts by applying LIME and SHAP to a gradient-boosted tree model trained on the Air Passengers dataset. It shows that a small set of lagged features and seasonal encodings explains most of the forecast variance.
Time-series forecasting underpins critical decisions across aviation, energy, retail and health. Classical autoregressive integrated moving average (ARIMA) models offer interpretability via their coefficients but struggle with nonlinearities, whereas tree-based machine-learning models such as XGBoost deliver high accuracy but are often opaque. This paper presents a unified framework for interpreting time-series forecasts using local interpretable model-agnostic explanations (LIME) and SHapley Additive exPlanations (SHAP). We convert a univariate series into a leakage-free supervised learning problem, train a gradient-boosted tree model alongside an ARIMA baseline, and apply post-hoc explanation methods to the trained model. Using the Air Passengers dataset as a case study, we show that a small set of lagged features (particularly the twelve-month lag) and seasonal encodings explain most of the forecast variance. We contribute: (i) a methodology for applying LIME and SHAP to time series without violating chronology; (ii) a theoretical exposition of the underlying algorithms; (iii) an empirical evaluation with extensive analysis; and (iv) guidelines for practitioners.
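The conversion of a univariate series into a leakage-free supervised problem can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `make_supervised` and `chronological_split`, the lag set (1, 2, 12), the sine/cosine seasonal encoding, and the synthetic series are all assumptions chosen for illustration. The points it shows are that every feature for time t is built from observations strictly before t, and that the hold-out set is the final stretch of the series rather than a random sample.

```python
import math


def make_supervised(series, lags=(1, 2, 12), period=12):
    """Build (feature rows, targets) from a univariate series using only past values.

    Hypothetical helper: the lag set and seasonal encoding are illustrative choices.
    """
    max_lag = max(lags)
    rows, targets = [], []
    for t in range(max_lag, len(series)):
        # Lag features come from strictly earlier time steps, so the target
        # series[t] never leaks into its own feature row.
        feats = {f"lag_{k}": series[t - k] for k in lags}
        # Cyclical encoding of the position within the seasonal period.
        angle = 2 * math.pi * (t % period) / period
        feats["season_sin"] = math.sin(angle)
        feats["season_cos"] = math.cos(angle)
        rows.append(feats)
        targets.append(series[t])
    return rows, targets


def chronological_split(rows, targets, test_size):
    """Hold out the last `test_size` rows; the data are never shuffled."""
    cut = len(rows) - test_size
    return rows[:cut], targets[:cut], rows[cut:], targets[cut:]


# Synthetic monthly series (linear trend plus a yearly cycle), for illustration only.
series = [100 + 2 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(48)]

rows, targets = make_supervised(series)
X_train, y_train, X_test, y_test = chronological_split(rows, targets, test_size=12)
```

The resulting rows could then be passed to any tabular learner and explained with LIME or SHAP. Because the split is chronological, explanations computed on the test rows describe forecasts for periods the model has not seen.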