LG, AI, CV · Mar 15, 2023

EvalAttAI: A Holistic Approach to Evaluating Attribution Maps in Robust and Non-Robust Models

arXiv:2303.08866v1 · 13 citations · h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses the lack of consensus in evaluating attribution methods for explainable AI, particularly in medical imaging classification, but is incremental as it builds on existing evaluation frameworks.

The paper tackles the problem of evaluating attribution maps in robust and non-robust models. It proposes a new faithfulness metric, EvalAttAI, that addresses limitations of existing metrics such as Deletion and Insertion. Using this metric, the authors found that Bayesian deep neural networks with Variational Density Propagation were consistently more explainable when paired with Vanilla Gradient. Robust models in general, however, did not show increased explainability, despite producing more visually plausible maps.

The expansion of explainable artificial intelligence as a field of research has generated numerous methods of visualizing and understanding the black box of a machine learning model. Attribution maps are generally used to highlight the parts of the input image that influence the model to make a specific decision. On the other hand, the robustness of machine learning models to natural noise and adversarial attacks is also being actively explored. This paper focuses on evaluating methods of attribution mapping to find whether robust neural networks are more explainable. We explore this problem within the application of classification for medical imaging. Explainability research is at an impasse. There are many methods of attribution mapping, but no current consensus on how to evaluate them and determine the ones that are the best. Our experiments on multiple datasets (natural and medical imaging) and various attribution methods reveal that two popular evaluation metrics, Deletion and Insertion, have inherent limitations and yield contradictory results. We propose a new explainability faithfulness metric (called EvalAttAI) that addresses the limitations of prior metrics. Using our novel evaluation, we found that Bayesian deep neural networks using the Variational Density Propagation technique were consistently more explainable when used with the best performing attribution method, the Vanilla Gradient. However, in general, various types of robust neural networks may not be more explainable, despite these models producing more visually plausible attribution maps.
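To make the evaluation setting concrete, here is a minimal sketch of the Deletion metric as it is commonly described: pixels are removed from most to least attributed, the model score is recorded after each removal, and the area under that curve is reported (a lower area is usually read as a more faithful map). This is not the authors' code and it does not implement EvalAttAI. The toy sigmoid "model", the zero baseline, and every function name below are assumptions made for illustration only.

```python
import math

def model_score(pixels, weights):
    """Toy classifier (hypothetical): sigmoid of a weighted sum of pixels."""
    z = sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-z))

def vanilla_gradient(pixels, weights):
    """Gradient of the toy score w.r.t. each pixel, analytic for this model."""
    s = model_score(pixels, weights)
    return [s * (1.0 - s) * w for w in weights]

def deletion_curve(pixels, weights, attribution, baseline=0.0):
    """Scores recorded while replacing pixels with a baseline value,
    from the most to the least attributed pixel."""
    order = sorted(range(len(pixels)), key=lambda i: attribution[i], reverse=True)
    current = list(pixels)
    scores = [model_score(current, weights)]
    for i in order:
        current[i] = baseline  # "delete" this pixel
        scores.append(model_score(current, weights))
    return scores

def area_under_curve(scores):
    """Trapezoidal area with the fraction of deleted pixels scaled to [0, 1]."""
    steps = len(scores) - 1
    return sum((scores[k] + scores[k + 1]) / 2.0 for k in range(steps)) / steps

# Illustrative input: four "pixels" and assumed weights.
weights = [2.0, -1.0, 0.5, 0.0]
pixels = [1.0, 1.0, 1.0, 1.0]

grad_map = vanilla_gradient(pixels, weights)
deletion_auc = area_under_curve(deletion_curve(pixels, weights, grad_map))

# Control: delete in the opposite order, which should hurt the score less early on.
reversed_map = [-g for g in grad_map]
control_auc = area_under_curve(deletion_curve(pixels, weights, reversed_map))
```

In this toy setting the gradient-ordered deletion gives a smaller area than the reversed ordering, which is the behaviour the metric rewards. The choice of baseline value is one of the design decisions that can change Deletion results on real images.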

Code Implementations: 1 repo