CVAILGMay 23, 2022

What You See is What You Classify: Black Box Attributions

arXiv:2205.11266v214 citationsh-index: 35
AI Analysis

This addresses the challenge of interpreting black-box deep learning models for computer vision, offering a more efficient and precise attribution method.

The paper tackles the problem of explaining deep image classifiers by identifying image regions that contribute to class scores, proposing a method that trains a second network to predict sharp, class-specific masks, and shows superior results on PASCAL VOC-2007 and Microsoft COCO-2014 datasets.

An important step towards explaining deep image classifiers lies in the identification of image regions that contribute to individual class scores in the model's output. However, doing this accurately is a difficult task due to the black-box nature of such networks. Most existing approaches find such attributions either using activations and gradients or by repeatedly perturbing the input. We instead address this challenge by training a second deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum. These attributions are provided in the form of masks that only show the classifier-relevant parts of an image, masking out the rest. Our approach produces sharper and more boundary-precise masks when compared to the saliency maps generated by other methods. Moreover, unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks in a single forward pass. This makes the proposed method very efficient during inference. We show that our attributions are superior to established methods both visually and quantitatively with respect to the PASCAL VOC-2007 and Microsoft COCO-2014 datasets.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes