LG, CV, ML | Sep 30, 2019

Interpretations are useful: penalizing explanations to align neural networks with prior knowledge

arXiv:1909.13584v4 | 252 citations
Originality: Incremental advance
AI Analysis

This work targets practitioners who need actionable insights from explainable AI methods. It is rated incremental because it builds on existing explanation techniques.

The paper tackles the problem of deep learning models assigning importance to the wrong features. It proposes contextual decomposition explanation penalization (CDEP), which corrects these errors by regularizing the model's explanations, and reports improved performance on toy and real datasets.

For an explanation of a deep learning model to be effective, it must both provide insight into a model and suggest a corresponding action in order to achieve some objective. Too often, the litany of proposed explainable deep learning methods stops at the first step, providing practitioners with insight into a model, but no way to act on it. In this paper, we propose contextual decomposition explanation penalization (CDEP), a method that enables practitioners to leverage existing explanation methods in order to increase the predictive accuracy of deep learning models. In particular, when shown that a model has incorrectly assigned importance to some features, CDEP enables practitioners to correct these errors by directly regularizing the provided explanations. Using explanations provided by contextual decomposition (CD) (Murdoch et al., 2018), we demonstrate the ability of our method to increase performance on an array of toy and real datasets.
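The idea of regularizing an explanation can be illustrated with a minimal sketch. This is not the authors' implementation and does not use contextual decomposition: it assumes a two-feature logistic regression, for which the contribution of feature j to the logit is simply w[j] * x[j], and it uses that contribution as a stand-in explanation. The dataset, the `train` function, and the penalty weight `lam` are all hypothetical and chosen for illustration. Feature 1 plays the role of a shortcut that the practitioner knows the model should ignore, so its contribution is penalized alongside the usual prediction loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lam, steps=2000, lr=0.1):
    """Logistic regression trained by (sub)gradient descent on
    mean log-loss + lam * mean |w[1] * x[1]|.

    The second term penalizes the explanation (logit contribution)
    of feature 1, which prior knowledge says should be unimportant.
    """
    w = [0.0, 0.0]
    n = len(data)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, y in data:
            p = sigmoid(w[0] * x[0] + w[1] * x[1])
            # Gradient of the log-loss with respect to each weight.
            for j in range(2):
                grad[j] += (p - y) * x[j] / n
            # Subgradient of the explanation penalty |w[1] * x[1]|.
            contribution = w[1] * x[1]
            if contribution != 0.0:
                grad[1] += lam * math.copysign(1.0, contribution) * x[1] / n
        w = [w[j] - lr * grad[j] for j in range(2)]
    return w

# Hypothetical toy data: feature 0 is a noisy but genuine signal,
# feature 1 is a spurious shortcut that matches the label exactly.
data = [
    ((1.0, 1.0), 1), ((0.5, 1.0), 1), ((-0.2, 1.0), 1),
    ((-1.0, -1.0), 0), ((-0.5, -1.0), 0), ((0.2, -1.0), 0),
]

w_plain = train(data, lam=0.0)  # no explanation penalty
w_pen = train(data, lam=1.0)    # shortcut feature penalized

print("unpenalized weights:", w_plain)
print("penalized weights:  ", w_pen)
```

Without the penalty, the model leans heavily on the shortcut feature; with it, the weight on that feature stays close to zero and the model has to rely on feature 0 instead. In a deep network the contribution term would come from an explanation method such as CD rather than from a closed-form product.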

Code Implementations: 4 repos
