CV | Jan 17, 2023

Opti-CAM: Optimizing saliency maps for interpretability

arXiv:2301.07002v3 | 48 citations | h-index: 43
Originality: Incremental advance
AI Analysis

This work addresses interpretability for machine learning practitioners, but the advance is incremental: it combines existing CAM-based and masking-based approaches.

The authors improve the interpretability of convolutional neural networks by optimizing a saliency map for each image. The resulting method, Opti-CAM, outperforms other CAM-based approaches on several datasets according to classification metrics.

Methods based on class activation maps (CAM) provide a simple mechanism to interpret predictions of convolutional neural networks by using linear combinations of feature maps as saliency maps. By contrast, masking-based methods optimize a saliency map directly in the image space or learn it by training another network on additional data. In this work we introduce Opti-CAM, combining ideas from CAM-based and masking-based approaches. Our saliency map is a linear combination of feature maps, where weights are optimized per image such that the logit of the masked image for a given class is maximized. We also fix a fundamental flaw in two of the most common evaluation metrics of attribution methods. On several datasets, Opti-CAM largely outperforms other CAM-based approaches according to the most relevant classification metrics. We provide empirical evidence supporting that localization and classifier interpretability are not necessarily aligned.
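The core idea in the abstract (a saliency map built as a linear combination of feature maps, with per-image weights chosen to maximize the class logit of the masked image) can be illustrated with a small toy sketch. Everything beyond that one-sentence description is an assumption made for illustration: the softmax parametrization of the weights, the min-max normalization of the map, the feature maps already being at image resolution, the hypothetical linear "classifier" `toy_logit`, and the use of finite-difference gradients in place of backpropagation through a real network.

```python
import numpy as np

def softmax(u):
    e = np.exp(u - u.max())
    return e / e.sum()

def saliency(u, feats):
    # Linear combination of the K feature maps (assumed softmax weights),
    # then min-max normalised to [0, 1] (also an assumption).
    s = np.tensordot(softmax(u), feats, axes=1)  # shape (H, W)
    return (s - s.min()) / (s.max() - s.min() + 1e-8)

def masked_logit(u, feats, image, class_logit):
    # Class logit of the image multiplied elementwise by the saliency map.
    return class_logit(image * saliency(u, feats))

def optimise_weights(feats, image, class_logit, steps=200, lr=0.5, eps=1e-4):
    # Gradient ascent on the weight variable u for a single image.
    # Central finite differences stand in for backprop (illustrative only).
    u = np.zeros(feats.shape[0])
    for _ in range(steps):
        grad = np.zeros_like(u)
        for k in range(u.size):
            d = np.zeros_like(u)
            d[k] = eps
            grad[k] = (masked_logit(u + d, feats, image, class_logit)
                       - masked_logit(u - d, feats, image, class_logit)) / (2 * eps)
        u += lr * grad
    return u

# Toy data (hypothetical): three 4x4 feature maps, each lighting up one quadrant.
feats = np.zeros((3, 4, 4))
feats[0, :2, :2] = 1.0   # top-left
feats[1, 2:, 2:] = 1.0   # bottom-right
feats[2, :2, 2:] = 1.0   # top-right
image = np.ones((4, 4))

# Hypothetical linear classifier: only the top-left quadrant supports the class.
template = -np.ones((4, 4))
template[:2, :2] = 1.0

def toy_logit(x):
    return float((template * x).sum())

logit_before = masked_logit(np.zeros(3), feats, image, toy_logit)
u_opt = optimise_weights(feats, image, toy_logit)
logit_after = masked_logit(u_opt, feats, image, toy_logit)
weights = softmax(u_opt)
```

In this toy setup the optimized weights concentrate on the feature map that covers the region the classifier relies on, and the masked-image logit rises accordingly. In a real setting the finite-difference loop would be replaced by automatic differentiation through the actual network.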

