CV · Sep 23, 2020

Information-Theoretic Visual Explanation for Black-Box Classifiers

arXiv:2009.11150v2 · 8 citations · Has Code
AI Analysis

This work addresses the need for interpretable AI by providing more accurate visual explanations for black-box classifiers. The contribution is incremental: it builds on existing attribution methods.

The paper tackles the problem of explaining predictions from black-box classifiers. It proposes an information-theoretic method that uses information gain and point-wise mutual information to generate attribution maps, and it reports improved correctness of those maps as measured by a quantitative metric.

In this work, we attempt to explain the prediction of any black-box classifier from an information-theoretic perspective. For each input feature, we compare the classifier outputs with and without that feature using two information-theoretic metrics. Accordingly, we obtain two attribution maps: an information gain (IG) map and a point-wise mutual information (PMI) map. The IG map provides a class-independent answer to "How informative is each pixel?", and the PMI map offers a class-specific explanation of "How much does each pixel support a specific class?" Compared to existing methods, our method improves the correctness of the attribution maps in terms of a quantitative metric. We also provide a detailed analysis of an ImageNet classifier using the proposed method, and the code is available online.
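The with-and-without comparison described in the abstract can be illustrated with a minimal sketch. Nothing below is taken from the authors' code. It rests on several assumptions, all labeled in the comments: PMI is taken as the log-ratio of the target-class probability with and without a feature, IG as the KL divergence between the two output distributions, and "without the feature" is approximated by averaging predictions over a small set of replacement values. `toy_classifier`, `predict_without`, `attribution_maps`, and `fill_values` are hypothetical names introduced only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Classifier = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_classifier(x: Sequence[float]) -> List[float]:
    """Hypothetical 2-class model: class 0 follows x[0], class 1 follows x[1].

    x[2] is deliberately ignored, so its attribution should be zero.
    """
    return softmax([2.0 * x[0], 2.0 * x[1]])


def predict_without(classifier: Classifier, x: Sequence[float], i: int,
                    fill_values: Sequence[float]) -> List[float]:
    """Approximate p(y | x without feature i).

    Assumption: the feature is marginalized by averaging predictions
    over a fixed set of replacement values.
    """
    acc = [0.0] * len(classifier(x))
    for v in fill_values:
        x_mod = list(x)
        x_mod[i] = v
        acc = [a + q for a, q in zip(acc, classifier(x_mod))]
    return [a / len(fill_values) for a in acc]


def attribution_maps(classifier: Classifier, x: Sequence[float], target: int,
                     fill_values: Sequence[float],
                     eps: float = 1e-12) -> Tuple[List[float], List[float]]:
    """Return (pmi_map, ig_map), one value per input feature."""
    p_full = classifier(x)
    pmi_map, ig_map = [], []
    for i in range(len(x)):
        p_wo = predict_without(classifier, x, i, fill_values)
        # Assumed PMI: log p(target | x) - log p(target | x without i).
        pmi_map.append(math.log(p_full[target] + eps)
                       - math.log(p_wo[target] + eps))
        # Assumed IG: KL(p(. | x) || p(. | x without i)), class-independent.
        ig_map.append(sum(p * (math.log(p + eps) - math.log(q + eps))
                          for p, q in zip(p_full, p_wo)))
    return pmi_map, ig_map


pmi_map, ig_map = attribution_maps(
    toy_classifier, [1.0, 0.0, 0.5], target=0, fill_values=[0.0, 0.5, 1.0])
print("PMI map:", pmi_map)
print("IG map:", ig_map)
```

In this toy setting, a positive PMI value means the feature's observed value supports the target class and a negative value means it argues against it. The IG value does not depend on the chosen class. For images, the same loop would run over pixels or patches rather than list entries.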

Code Implementations: 1 repo
