AI · LG · Dec 31, 2020

Quantitative Evaluations on Saliency Methods: An Experimental Study

arXiv:2012.15616v1 · 24 citations
AI Analysis

This work provides a comprehensive experimental study of existing metrics for saliency methods, which helps researchers and practitioners in eXplainable AI understand the strengths and weaknesses of different methods.

This paper conducts an exhaustive experimental study of various saliency methods using metrics such as faithfulness, localization, false positives, sensitivity, and stability. The study concludes that no single explanation method dominates across all metrics, though Grad-CAM and RISE perform well on most of them.

It has long been debated that eXplainable AI (XAI) is an important topic, but it lacks rigorous definitions and fair metrics. In this paper, we briefly summarize the status quo of the metrics and present an exhaustive experimental study based on them, covering faithfulness, localization, false positives, sensitivity checks, and stability. From the experimental results, we conclude that, among all the methods we compare, no single explanation method dominates the others across all metrics. Nonetheless, Gradient-weighted Class Activation Mapping (Grad-CAM) and Randomized Input Sampling for Explanation (RISE) perform fairly well on most of the metrics. Using a filtered set of metrics, we further present a case study that diagnoses the classification bases of models. Beyond providing a comprehensive experimental study of metrics, we also examine measuring factors that current metrics miss, and we hope this work can serve as a guide for future research.
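
To make the idea of a faithfulness metric concrete, here is a minimal sketch of a deletion-style score, one common way faithfulness is measured for saliency maps. The abstract does not spell out the paper's exact protocol, so this is an assumption rather than the authors' implementation. The names `deletion_scores`, `deletion_auc`, and `toy_model` are hypothetical, and the "model" is a toy function that only looks at the top-left 2x2 patch of a 4x4 image.

```python
import numpy as np

def deletion_scores(model, image, saliency, steps=4, baseline=0.0):
    """Record the model score while the most salient pixels are removed first.

    Illustrative sketch only: `model` is any callable mapping an image array
    to a scalar score, and `saliency` has the same shape as `image`.
    """
    order = np.argsort(saliency.ravel())[::-1]   # pixel indices, most salient first
    perturbed = image.astype(float).copy()
    flat = perturbed.reshape(-1)                 # view onto `perturbed`
    scores = [model(perturbed)]                  # score of the untouched image
    chunk = int(np.ceil(order.size / steps))
    for start in range(0, order.size, chunk):
        flat[order[start:start + chunk]] = baseline  # "delete" the next chunk
        scores.append(model(perturbed))
    return np.array(scores)

def deletion_auc(scores):
    """Trapezoidal area under the deletion curve, with the x-axis scaled to [0, 1]."""
    x = np.linspace(0.0, 1.0, len(scores))
    return float(np.sum((scores[1:] + scores[:-1]) / 2.0 * np.diff(x)))

# Toy setup (hypothetical): the model only uses the top-left 2x2 patch.
def toy_model(img):
    return float(img[:2, :2].mean())

image = np.ones((4, 4))
good_map = np.zeros((4, 4))
good_map[:2, :2] = 1.0          # highlights exactly what the model uses
bad_map = 1.0 - good_map        # highlights everything else

good_auc = deletion_auc(deletion_scores(toy_model, image, good_map))
bad_auc = deletion_auc(deletion_scores(toy_model, image, bad_map))
print(good_auc, bad_auc)
```

Under this convention a lower area indicates a more faithful map, because removing the highlighted pixels makes the score collapse sooner. In the toy setup the map that matches the model's evidence therefore scores lower than its complement.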

