ReX: A Framework for Incorporating Temporal Information in Model-Agnostic Local Explanation Techniques
ReX addresses a limitation in the interpretability of machine learning models that handle temporal data, offering an incremental improvement over existing techniques.
The paper tackles the problem that existing local model-agnostic explanation techniques are ineffective for models with variable-length inputs. It proposes ReX, a framework for incorporating temporal information into these techniques, which significantly improves explanation fidelity and helps users better understand model behaviors.
Existing local model-agnostic explanation techniques are ineffective for machine learning models that consider inputs of variable lengths, as they do not consider temporal information embedded in these models. To address this limitation, we propose \textsc{ReX}, a general framework for incorporating temporal information in these techniques. Our key insight is that these techniques typically learn a model surrogate by sampling model inputs and outputs, and we can incorporate temporal information in a uniform way by only changing the sampling process and the surrogate features. We instantiate our approach on three popular explanation techniques: Anchors, LIME, and Kernel SHAP. To evaluate the effectiveness of \textsc{ReX}, we apply our approach to six models in three different tasks. Our evaluation results demonstrate that our approach 1) significantly improves the fidelity of explanations, making model-agnostic techniques outperform a state-of-the-art model-specific technique on its target model, and 2) helps end users better understand the models' behaviors.
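The key insight above (change only the sampling process and the surrogate features) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `sample_by_deletion`, `positional_features`, and `presence_features` are hypothetical, and both deletion-based perturbation and position-independent presence features are assumptions chosen only to show how such a change might look.

```python
import random


def sample_by_deletion(tokens, n_samples, keep_prob=0.5, seed=0):
    """Perturb a sequence by randomly deleting tokens.

    Deleting tokens (rather than masking them in place) lets the
    perturbed inputs vary in length. Illustrative assumption only.
    """
    rng = random.Random(seed)
    return [[t for t in tokens if rng.random() < keep_prob]
            for _ in range(n_samples)]


def positional_features(original, sample):
    """Fixed-position features: 1 iff sample[i] equals original[i]."""
    return [int(i < len(sample) and sample[i] == tok)
            for i, tok in enumerate(original)]


def presence_features(original, sample):
    """Position-independent features: 1 iff the token occurs anywhere."""
    vocab = list(dict.fromkeys(original))  # unique tokens, original order
    present = set(sample)
    return [int(tok in present) for tok in vocab]


original = ["not", "good"]
shifted = ["good"]  # "not" was deleted, so "good" moved to index 0
print(positional_features(original, shifted))  # [0, 0]
print(presence_features(original, shifted))    # [0, 1]
```

In this toy case the fixed-position features no longer register that "good" is present once a deletion shifts it, while the position-independent features still do. A surrogate fitted on the latter could therefore describe inputs of different lengths, which is the kind of effect the abstract attributes to changing the sampling process and surrogate features.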