PAX-TS: Model-agnostic multi-granular explanations for time series forecasting via localized perturbations
PAX-TS provides interpretability for time series forecasting models, which is crucial for practitioners who need to understand and trust forecasts. The contribution is incremental, however, because it builds on existing perturbation-based methods.
The paper tackles the problem of explaining opaque time series forecasting models by proposing PAX-TS, a model-agnostic post-hoc algorithm that uses localized perturbations to generate multi-granular explanations. It demonstrates the method on 7 algorithms and 10 datasets, finds that explanations differ between high- and low-performing models, and identifies 6 performance-indicating patterns.
Time series forecasting has improved considerably in recent years, with transformer models and large language models driving advances in the state of the art. Modern forecasting models are generally opaque and do not provide explanations for their forecasts, while well-known post-hoc explainability methods like LIME are not suitable for the forecasting context. We propose PAX-TS, a model-agnostic post-hoc algorithm to explain time series forecasting models and their forecasts. Our method is based on localized input perturbations and results in multi-granular explanations. Further, it is able to characterize cross-channel correlations for multivariate time series forecasts. We clearly outline the algorithmic procedure behind PAX-TS, demonstrate it on a benchmark with 7 algorithms and 10 diverse datasets, compare it with two other state-of-the-art explanation algorithms, and present the different explanation types of the method. We find that the explanations of high-performing and low-performing algorithms differ on the same datasets, highlighting that the explanations of PAX-TS effectively capture a model's behavior. Based on time step correlation matrices resulting from the benchmark, we identify 6 classes of patterns that repeatedly occur across different datasets and algorithms. We find that these patterns are indicators of performance, with noticeable differences in forecasting error between the classes. Lastly, we outline a multivariate example where PAX-TS demonstrates how the forecasting model takes cross-channel correlations into account. With PAX-TS, the mechanisms of time series forecasting models can be illustrated at different levels of detail, and its explanations can be used to answer practical questions about forecasts.
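To make the idea of localized input perturbations concrete, the following is a minimal sketch of a generic perturbation-based sensitivity analysis. It is not the PAX-TS algorithm itself, whose exact procedure is not given in the abstract. The function names, the additive perturbation `delta`, the `window` parameter (used here as a rough stand-in for explanation granularity), and the toy last-value forecaster are all assumptions made for illustration.

```python
import numpy as np


def naive_last_value_forecaster(x, horizon):
    """Hypothetical stand-in model: repeats the last observed value."""
    return np.full(horizon, x[-1], dtype=float)


def perturbation_sensitivity(model, x, horizon, delta=1.0, window=1):
    """Estimate how each input window influences each forecast step.

    Assumed procedure (illustrative only): shift one contiguous window of
    the input by `delta`, re-run the model, and record the change in the
    forecast relative to the unperturbed baseline, divided by `delta`.
    Returns an array of shape (n_windows, horizon).
    """
    x = np.asarray(x, dtype=float)
    baseline = model(x, horizon)
    n_windows = len(x) - window + 1
    sensitivity = np.zeros((n_windows, horizon))
    for start in range(n_windows):
        perturbed = x.copy()
        perturbed[start:start + window] += delta  # localized perturbation
        sensitivity[start] = (model(perturbed, horizon) - baseline) / delta
    return sensitivity


x = np.array([1.0, 2.0, 3.0, 4.0])
sens = perturbation_sensitivity(naive_last_value_forecaster, x, horizon=3)
print(sens)
```

For the toy forecaster, only the row for the final input time step is non-zero, which matches the model's behavior of copying the last observation. Increasing `window` perturbs longer spans at once and yields a coarser view of the same model. Because the function only calls `model(x, horizon)`, any forecaster with that call signature could be substituted.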