CV | Oct 12, 2023

Saliency-Bench: A Comprehensive Benchmark for Evaluating Visual Explanations

Yifei Zhang, James Song, Siyi Gu, Tianxu Jiang, Bo Pan, Guangji Bai, Liang Zhao

arXiv:2310.08537v3 | 6.87 citations | h-index: 10

Originality: Synthesis-oriented

AI Analysis

Saliency-Bench gives explainable-AI researchers a standardized tool for evaluating visual explanations, though the contribution is incremental because it builds on existing saliency methods.

The authors address the lack of standardized evaluation for visual explanations in XAI by introducing Saliency-Bench, a benchmark suite with eight annotated datasets and a unified pipeline that assesses saliency methods across diverse tasks.

Explainable AI (XAI) has gained significant attention for providing insights into the decision-making processes of deep learning models, particularly for image classification tasks, where visual explanations are rendered as saliency maps. Despite this success, challenges remain because annotated datasets and standardized evaluation pipelines are lacking. In this paper, we introduce Saliency-Bench, a novel benchmark suite designed to evaluate visual explanations generated by saliency methods across multiple datasets. We curated, constructed, and annotated eight datasets that together cover diverse tasks such as scene classification, cancer diagnosis, object classification, and action classification, each with corresponding ground-truth explanations. The benchmark includes a standardized and unified evaluation pipeline for assessing the faithfulness and alignment of visual explanations, providing a holistic assessment of explanation performance. We benchmark widely used saliency methods on these eight datasets with different image classifier architectures to evaluate explanation quality. Additionally, we developed an easy-to-use API that automates the evaluation pipeline, from data access and data loading to result evaluation. The benchmark is available via our website: https://xaidataset.github.io.
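
The abstract names two evaluation axes, alignment and faithfulness, without specifying the metrics. As a rough illustration only, the sketch below assumes alignment is measured as intersection-over-union (IoU) between a thresholded saliency map and a ground-truth mask, and faithfulness as the drop in model score after erasing the most salient pixels. The function names `alignment_iou` and `deletion_drop`, the threshold, the deleted fraction, and the toy scoring function are all hypothetical; this is not the Saliency-Bench API and may differ from the metrics the paper actually uses.

```python
import numpy as np


def alignment_iou(saliency, gt_mask, threshold=0.5):
    """IoU between the binarized saliency map and a ground-truth mask.

    Assumption: saliency values are already scaled to [0, 1].
    """
    pred = saliency >= threshold
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both regions empty: treat as perfect agreement
    return float(np.logical_and(pred, gt).sum() / union)


def deletion_drop(score_fn, image, saliency, fraction=0.2, fill=0.0):
    """Drop in score after erasing the top `fraction` most salient pixels.

    `score_fn` stands in for a classifier's confidence in the predicted
    class (hypothetical; any callable mapping an image to a float works).
    """
    k = max(1, int(round(fraction * saliency.size)))
    # Flat indices of the k highest saliency values.
    top = np.argsort(saliency, axis=None)[::-1][:k]
    mask = np.zeros(saliency.size, dtype=bool)
    mask[top] = True
    mask = mask.reshape(saliency.shape)
    perturbed = image.copy()
    perturbed[mask] = fill
    return float(score_fn(image) - score_fn(perturbed))


# Toy 4x4 example: the "object" occupies the top-left 2x2 patch.
gt_mask = np.zeros((4, 4))
gt_mask[:2, :2] = 1
saliency = np.zeros((4, 4))
saliency[:2, :2] = 0.9   # explanation covers the object...
saliency[0, 2] = 0.6     # ...plus one pixel outside it
image = np.ones((4, 4))


def toy_score(x):
    # Stand-in "model": its score depends only on the top-left patch.
    return float(x[:2, :2].mean())


iou = alignment_iou(saliency, gt_mask)
drop = deletion_drop(toy_score, image, saliency, fraction=0.25)
print(f"alignment IoU = {iou:.2f}, faithfulness drop = {drop:.2f}")
```

In this toy case the explanation overlaps the annotated region well but not perfectly, and removing the most salient quarter of the pixels wipes out the toy model's score, which is the pattern a faithful explanation would be expected to show under these assumed metrics.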
