Explaining Time Series Classification Predictions via Causal Attributions
This work addresses the need for more reliable explanations in time series classification and highlights the risks of using associational methods for causal inference. Its contribution is incremental, however, since it builds on existing attribution and causal frameworks.
The authors tackled the problem of understanding machine learning model decisions in time series classification by introducing a model-agnostic attribution method based on causal effects rather than associational relationships. They demonstrated that causal and associational attributions differ in important details across diverse tasks.
Despite the excellent performance of machine learning models, understanding their decisions remains a long-standing goal. Although commonly used attribution methods from explainable AI attempt to address this issue, they typically rely on associational rather than causal relationships. In this study, within the context of time series classification, we introduce a novel model-agnostic attribution method to assess the causal effect of concepts, i.e., predefined segments within a time series, on classification outcomes. We compare these causal attributions with closely related associational attributions, both theoretically and empirically. To estimate counterfactual outcomes, we use state-of-the-art diffusion models backed by state space models. We demonstrate the insights gained by our approach on a diverse set of qualitatively different time series classification tasks. Although causal and associational attributions often share similarities, in all cases they differ in important details, underscoring the risks of drawing causal conclusions from associational data alone. We believe that the proposed approach is also widely applicable in other domains to shed light on the limits of associational attributions.
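The general idea of scoring a predefined segment by replacing it with sampled alternatives can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the exact definition of the causal or associational attribution, so the score used here (original prediction minus the mean prediction over infilled variants) is an assumption, and `noise_infill` is a hypothetical stand-in for the diffusion-based counterfactual generator. The names `segment_attribution`, `toy_predict`, and `noise_infill` are all illustrative.

```python
import numpy as np

def segment_attribution(predict, x, start, end, infill, n_samples=100, rng=None):
    """Mean drop in the classifier score when x[start:end] is replaced by samples.

    Assumed definition for illustration only; the paper's own attribution
    formula is not stated in the abstract.
    """
    rng = np.random.default_rng(rng)
    baseline = predict(x[None, :])[0]
    # Make n_samples copies of x and overwrite only the chosen segment.
    variants = np.repeat(x[None, :], n_samples, axis=0)
    variants[:, start:end] = infill(x, start, end, n_samples, rng)
    return baseline - predict(variants).mean()

def toy_predict(batch):
    # Hypothetical "classifier": sigmoid of the mean over time steps 10..19.
    return 1.0 / (1.0 + np.exp(-batch[:, 10:20].mean(axis=1)))

def noise_infill(x, start, end, n_samples, rng):
    # Hypothetical stand-in for a learned generative infiller (e.g. a
    # diffusion model): draws standard-normal values for the segment.
    return rng.normal(0.0, 1.0, size=(n_samples, end - start))

x = np.zeros(50)
x[10:20] = 3.0  # the only segment the toy classifier looks at

relevant = segment_attribution(toy_predict, x, 10, 20, noise_infill, rng=0)
irrelevant = segment_attribution(toy_predict, x, 30, 40, noise_infill, rng=0)
print(f"segment 10-20: {relevant:.3f}, segment 30-40: {irrelevant:.3f}")
```

In this toy setting the segment the classifier depends on receives a clearly positive score, while a segment it ignores scores essentially zero. Whether such a score reflects a causal or merely associational effect depends on how the replacement samples are generated, which is the distinction the abstract draws and which this sketch does not attempt to reproduce.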