CV · AI · LG · Apr 6, 2021

White Box Methods for Explanations of Convolutional Neural Networks in Image Classification Tasks

arXiv:2104.02548v2 · 23 citations
AI Analysis

This work addresses the problem of interpretability in deep learning for researchers and practitioners, but it is incremental as it organizes existing methods rather than introducing new ones.

The paper tackles the lack of transparency in Convolutional Neural Networks (CNNs) for image classification by proposing a topology and classification of white box explanation methods, providing a comprehensive overview to help researchers compare and select methods for generating pixel-level importance maps.

In recent years, deep learning has become prevalent in applications across multiple domains. Convolutional Neural Networks (CNNs) in particular have demonstrated state-of-the-art performance on image classification. However, the decisions made by these networks are not transparent and cannot be directly interpreted by a human. Several approaches have been proposed to explain and understand the reasoning behind a prediction made by a network. In this paper, we propose a topology for grouping these methods based on their assumptions and implementations. We focus primarily on white box methods that leverage information about the internal architecture of a network to explain its decision. Given the task of image classification and a trained CNN, this work aims to provide a comprehensive and detailed overview of a set of methods that can be used to create explanation maps for a particular image; these maps assign an importance score to each pixel of the image based on its contribution to the decision of the network. We also propose a further classification of the white box methods based on their implementations to enable better comparisons and help researchers find the methods best suited to different scenarios.
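To make the idea of a pixel-level explanation map concrete, here is a minimal sketch of one common white box approach: plain gradient saliency, where each pixel's importance is the magnitude of the gradient of the class score with respect to that pixel. The toy network (one convolution, ReLU, global average pooling, a linear layer), the function names, and the random weights are all assumptions made for illustration; none of them come from the paper, and this is not presented as any specific method it surveys.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Plain 'valid' cross-correlation of a 2-D image with a 2-D kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def class_score(image, kernels, weights, target):
    """Hypothetical toy CNN: conv -> ReLU -> global average pool -> linear."""
    pooled = np.array(
        [np.maximum(conv2d_valid(image, k), 0.0).mean() for k in kernels]
    )
    return float(weights[target] @ pooled)


def gradient_saliency(image, kernels, weights, target):
    """Return |d score / d pixel| by backpropagating through the toy CNN by hand.

    This is a white box computation: it reads the kernels, the ReLU
    activation pattern, and the classifier weights directly.
    """
    grad = np.zeros_like(image, dtype=float)
    for k_idx, kernel in enumerate(kernels):
        pre = conv2d_valid(image, kernel)
        # Gradient of (linear weight * mean(ReLU(pre))) w.r.t. each pre-activation.
        upstream = weights[target, k_idx] * (pre > 0) / pre.size
        kh, kw = kernel.shape
        # Each output position spreads its gradient over its receptive field.
        for i in range(pre.shape[0]):
            for j in range(pre.shape[1]):
                grad[i:i + kh, j:j + kw] += upstream[i, j] * kernel
    return np.abs(grad)


# Illustrative inputs only: a random 6x6 "image" and random parameters.
rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernels = rng.standard_normal((2, 3, 3))   # two 3x3 filters
weights = rng.standard_normal((3, 2))      # three classes, two pooled features
saliency = gradient_saliency(image, kernels, weights, target=1)
print(saliency.round(3))
```

The resulting array has the same shape as the input image, so it can be overlaid on the image as a heat map. With a real trained CNN one would normally obtain the same gradient from a framework's automatic differentiation rather than deriving it by hand.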

