CV, LG · Aug 7, 2019

Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks

arXiv:1908.02686v1 · 131 citations
AI Analysis

This work addresses the need for interpretable neural networks, which matters to both researchers and practitioners. It appears incremental, however: it builds on existing post-hoc explanation methods and contributes a novel defense technique.

The authors address the problem of generating interpretable visual explanations for convolutional neural networks. They propose a post-hoc optimization method that defends against adversarial evidence by filtering gradients. The resulting explanations are fine-grained, preserve image characteristics, and are valid model inputs.

To verify and validate networks, it is essential to gain insight into their decisions and limitations, as well as possible shortcomings of the training data. In this work, we propose a post-hoc, optimization-based visual explanation method, which highlights the evidence in the input image for a specific prediction. Our approach is based on a novel technique to defend against adversarial evidence (i.e., faulty evidence due to artefacts) by filtering gradients during optimization. The defense does not depend on human-tuned parameters. It enables explanations that are both fine-grained and preserve the characteristics of images, such as edges and colors. The explanations are interpretable, suited for visualizing detailed evidence, and can be tested, as they are valid model inputs. We qualitatively and quantitatively evaluate our approach on a multitude of models and datasets.
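
The idea of filtering gradients during optimization can be illustrated with a toy example. The abstract does not state the filtering rule, so the sketch below assumes one plausible reading: a hidden unit may not be more active on the explanation than it is on the original input, and gradients through units that violate this bound are zeroed. The network, the function `explain`, the `use_filter` flag, and all parameter values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def explain(x, W1, b1, w2, use_filter=True, steps=500, lr=0.1, lam=0.05):
    """Gradient ascent on a mask m in [0, 1]; the explanation is e = m * x.

    Objective (assumed): class score of e minus lam * sum(m), so the mask
    keeps only inputs that support the score.
    """
    h_ref = relu(W1 @ x + b1)               # hidden activations on the original input
    m = np.ones_like(x, dtype=float)        # start from the unmasked input
    for _ in range(steps):
        z = W1 @ (m * x) + b1
        active = z > 0                      # ordinary ReLU gradient gate
        if use_filter:
            # Assumed filter: block gradients through units that are more
            # active on the explanation than on the original input.
            active &= z <= h_ref
        g_z = w2 * active                   # d(score)/dz
        g_m = (W1.T @ g_z) * x - lam        # d(objective)/dm
        m = np.clip(m + lr * g_m, 0.0, 1.0)
    return m

# Toy network: unit 0 sees input 0, unit 1 sees (input 2 - input 1),
# unit 2 sees input 1 and counts against the class.
x = np.array([1.0, 1.0, 1.0])
W1 = np.array([[1.0, 0.0, 0.0],
               [0.0, -1.0, 1.0],
               [0.0, 1.0, 0.0]])
b1 = np.zeros(3)
w2 = np.array([1.0, 2.0, -1.0])

m_filtered = explain(x, W1, b1, w2, use_filter=True)
m_unfiltered = explain(x, W1, b1, w2, use_filter=False)
```

In this toy setting, unit 1 is inactive on the original input and only switches on once input 1 has been masked out. The unfiltered run exploits that and keeps input 2, while the filtered run discards it and keeps only input 0.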

