AI LG Dec 31, 2020

Quantitative Evaluations on Saliency Methods: An Experimental Study

Xiao-Hui Li, Yuhan Shi, Haoyang Li, Wei Bai, Yuanwei Song, Caleb Chen Cao, Lei Chen

arXiv:2012.15616v114.524 citations


AI Analysis

This work provides a comprehensive experimental study of existing metrics for saliency methods, helping researchers and practitioners in eXplainable AI (XAI) understand the strengths and weaknesses of different methods.

This paper conducts an exhaustive experimental study of saliency methods using metrics for faithfulness, localization, false positives, sensitivity, and stability. It concludes that no single explanation method dominates across all metrics, though Grad-CAM and RISE perform fairly well on most of them.

It has long been debated that eXplainable AI (XAI) is an important topic, but it lacks rigorous definitions and fair metrics. In this paper, we briefly summarize the status quo of the metrics and present an exhaustive experimental study based on them, covering faithfulness, localization, false positives, sensitivity checks, and stability. From the experimental results, we conclude that among all the methods we compare, no single explanation method dominates the others in all metrics. Nonetheless, Gradient-weighted Class Activation Mapping (Grad-CAM) and Randomized Input Sampling for Explanation (RISE) perform fairly well in most of the metrics. Using a filtered set of metrics, we further present a case study that diagnoses the bases on which models make their classifications. Besides providing a comprehensive experimental study of metrics, we also examine measuring factors that current metrics miss, and we hope this work can serve as a guide for future research.
