LG | AI | Nov 30, 2020

TimeSHAP: Explaining Recurrent Models through Sequence Perturbations

arXiv:2012.00073v2 | 148 citations
AI Analysis

This work addresses the problem of explaining predictions from recurrent neural networks, which is crucial for understanding and debugging state-of-the-art sequential decision-making models, especially in high-stakes applications like fraud detection. It is an incremental contribution to the field of explainable AI.

This paper introduces TimeSHAP, a model-agnostic recurrent explainer that extends KernelSHAP to sequential data and provides feature-, timestep-, and cell-level attributions. It also proposes a pruning method that reduces both the computational cost and the variance of the attributions. Applied to a real-world bank account takeover fraud detection RNN, TimeSHAP showed that positively predicted sequences can be pruned to 10% of their original length, and that for positive predictions the most recent input event contributes on average only 41% of the model's score.
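
The timestep-level attribution idea can be illustrated with a minimal sketch. It is not the TimeSHAP library's API: `toy_score`, `perturb`, and `event_shapley` are hypothetical names, the model is a toy recency-weighted scorer standing in for an RNN, and the all-zeros baseline event is an assumption. For brevity the sketch enumerates every coalition of events exactly, whereas KernelSHAP (and hence TimeSHAP) approximates these values with a sampled weighted regression.

```python
from itertools import combinations
from math import factorial

import numpy as np


def toy_score(seq: np.ndarray) -> float:
    """Hypothetical stand-in for an RNN: recency-weighted sum squashed to (0, 1)."""
    steps = seq.shape[0]
    weights = 0.5 ** np.arange(steps - 1, -1, -1)  # newest event gets weight 1
    return float(1.0 / (1.0 + np.exp(-(weights @ seq.sum(axis=1)))))


def perturb(seq: np.ndarray, baseline: np.ndarray, keep) -> np.ndarray:
    """Keep the events whose index is in `keep`; replace the rest with the baseline event."""
    mask = np.array([t in keep for t in range(seq.shape[0])])
    return np.where(mask[:, None], seq, baseline)


def event_shapley(model, seq: np.ndarray, baseline: np.ndarray) -> np.ndarray:
    """Exact Shapley value of every timestep (feasible only for short sequences)."""
    n = seq.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = model(perturb(seq, baseline, subset + (i,)))
                without_i = model(perturb(seq, baseline, subset))
                phi[i] += weight * (with_i - without_i)
    return phi


seq = np.array([[0.1, 0.0], [0.2, 0.1], [1.5, 2.0]])  # 3 events, 2 features
baseline = np.zeros(2)  # assumed "uninformative" event
phi = event_shapley(toy_score, seq, baseline)
print(phi)
```

The attributions sum to the difference between the score of the full sequence and the score of an all-baseline sequence. Exact enumeration costs a number of model calls exponential in the sequence length, which is why a sampling-based estimator is needed for realistic sequences.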

Although recurrent neural networks (RNNs) are state-of-the-art in numerous sequential decision-making tasks, there has been little research on explaining their predictions. In this work, we present TimeSHAP, a model-agnostic recurrent explainer that builds upon KernelSHAP and extends it to the sequential domain. TimeSHAP computes feature-, timestep-, and cell-level attributions. As sequences may be arbitrarily long, we further propose a pruning method that is shown to dramatically decrease both its computational cost and the variance of its attributions. We use TimeSHAP to explain the predictions of a real-world bank account takeover fraud detection RNN model, and draw key insights from its explanations: i) the model identifies important features and events aligned with what fraud analysts consider cues for account takeover; ii) positively predicted sequences can be pruned to only 10% of the original length, as older events have residual attribution values; iii) the most recent input event of positive predictions contributes on average only 41% of the model's score; iv) the client's age receives notably high attribution, suggesting potentially discriminatory reasoning, which was later confirmed by higher false positive rates for older clients.
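
The abstract does not spell out how the pruning works, so the following is only a sketch of one way such pruning could be done, under a stated assumption: all events before a split index are grouped into a single "older events" player and the remaining events into a "recent events" player. With two players the Shapley value is exact from four model calls. The names `toy_score`, `old_events_attribution`, and `prune_index`, the tolerance `tol`, and the all-zeros baseline are hypothetical and not taken from the paper or its code.

```python
import numpy as np


def toy_score(seq: np.ndarray) -> float:
    """Hypothetical stand-in for an RNN score; older events decay geometrically."""
    weights = 0.5 ** np.arange(seq.shape[0] - 1, -1, -1)
    return float(1.0 / (1.0 + np.exp(-(weights @ seq.sum(axis=1)))))


def old_events_attribution(model, seq, baseline, split: int) -> float:
    """Exact two-player Shapley value of the group of events before `split`."""
    background = np.tile(baseline, (seq.shape[0], 1))
    only_old = background.copy()
    only_old[:split] = seq[:split]
    only_recent = background.copy()
    only_recent[split:] = seq[split:]
    # Average the group's marginal contribution over both player orderings.
    return 0.5 * ((model(only_old) - model(background))
                  + (model(seq) - model(only_recent)))


def prune_index(model, seq, baseline, tol: float) -> int:
    """Largest split whose older group has |attribution| <= tol (assumed criterion)."""
    for split in range(seq.shape[0] - 1, -1, -1):
        if abs(old_events_attribution(model, seq, baseline, split)) <= tol:
            return split
    return 0


seq = np.ones((6, 2))    # 6 events, 2 features
baseline = np.zeros(2)   # assumed "uninformative" event
split = prune_index(toy_score, seq, baseline, tol=0.01)
print(f"events before index {split} could be lumped together; {len(seq) - split} kept")
```

After such a split, only the kept events plus one grouped "older events" player would need individual attributions, which shrinks the number of coalitions an explainer has to sample.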

Code Implementations: 1 repo
