The Weighting Game: Evaluating Quality of Explainability Methods
This work addresses the need for better evaluation of explainability methods in computer vision, though it is incremental as it builds on existing metrics.
The paper tackles the problem of evaluating explanation heatmaps for image classification by introducing two new metrics: the Weighting Game, which measures how much of an explanation falls within the correct class's segmentation mask, and a stability metric based on zooming/panning transformations. The results quantitatively assess commonly used CAM methods and show that model architecture affects explanation quality.
The objective of this paper is to assess the quality of explanation heatmaps for image classification tasks. We approach the evaluation of explainability methods through the lens of accuracy and stability, and make the following contributions. Firstly, we introduce the Weighting Game, which measures how much of a class-guided explanation is contained within the correct class's segmentation mask. Secondly, we introduce a metric for explanation stability, using zooming/panning transformations to measure differences between saliency maps with similar contents. Using these new metrics, we run quantitative experiments to evaluate the quality of explanations provided by commonly used CAM methods. We also contrast explanation quality across model architectures; the findings highlight the need to consider model architecture when choosing an explainability method.
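To make the two ideas concrete, the following is a minimal sketch and not the paper's reference implementation. It assumes the Weighting Game score can be read as the fraction of non-negative heatmap mass that lies inside a binary class mask, and it illustrates stability with a simple panning comparison: two horizontally shifted crops are explained separately and the saliency maps are compared on the region they share. The function names, the clipping of negative values, the mean absolute difference, and the `explain_fn` callable are all illustrative assumptions.

```python
import numpy as np


def weighting_game_score(heatmap, mask, eps=1e-12):
    """Fraction of heatmap mass inside the class mask (assumed formulation)."""
    # Assumption: negative attributions are clipped so the map acts as a weighting.
    heatmap = np.clip(np.asarray(heatmap, dtype=np.float64), 0.0, None)
    mask = np.asarray(mask, dtype=bool)
    if heatmap.shape != mask.shape:
        raise ValueError("heatmap and mask must have the same shape")
    total = heatmap.sum()
    if total <= eps:
        # An empty explanation places no mass on the object.
        return 0.0
    return float(heatmap[mask].sum() / total)


def pan_stability_gap(explain_fn, image, top, left, size, shift):
    """Mean absolute saliency difference on the overlap of two panned crops.

    explain_fn is a hypothetical callable mapping a 2-D crop to a saliency
    map of the same shape. Lower values mean more stable explanations.
    """
    image = np.asarray(image, dtype=np.float64)
    if not 0 < shift < size:
        raise ValueError("shift must satisfy 0 < shift < size")
    if top < 0 or left < 0:
        raise ValueError("top and left must be non-negative")
    if top + size > image.shape[0] or left + shift + size > image.shape[1]:
        raise ValueError("crops fall outside the image")

    crop_a = image[top:top + size, left:left + size]
    crop_b = image[top:top + size, left + shift:left + shift + size]
    sal_a = np.asarray(explain_fn(crop_a), dtype=np.float64)
    sal_b = np.asarray(explain_fn(crop_b), dtype=np.float64)

    # Columns shift..size of crop A show the same pixels as columns
    # 0..size-shift of crop B, so only that shared region is compared.
    overlap_a = sal_a[:, shift:]
    overlap_b = sal_b[:, :size - shift]
    return float(np.abs(overlap_a - overlap_b).mean())


if __name__ == "__main__":
    heatmap = np.array([[0.0, 1.0], [1.0, 2.0]])
    mask = np.array([[0, 0], [0, 1]])
    print(weighting_game_score(heatmap, mask))

    image = np.arange(36, dtype=np.float64).reshape(6, 6)
    print(pan_stability_gap(lambda crop: crop, image, 0, 0, 4, 1))
```

Under this reading, a Weighting Game score near 1 means the explanation concentrates on the labelled object, while a score near 0 means it concentrates on the background. In the stability sketch, the saliency maps are compared without per-crop normalization, since rescaling each crop independently would introduce differences unrelated to the explanation method itself.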