LG, AI, CV | Dec 7, 2025

Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods

arXiv:2512.06665v1 | h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses the need for more accurate robustness evaluation in explainable AI, which matters to researchers and practitioners who rely on feature attribution methods. Its contribution is incremental: it refines existing evaluation approaches.

The paper evaluates the robustness of feature attribution methods for deep neural networks. It introduces a new definition of similar inputs, a new robustness metric, and a GAN-based method for generating such inputs. The resulting evaluation is more objective and reveals weaknesses of the attribution method rather than flaws of the network.

This paper studies the robustness of feature attribution methods for deep neural networks. It challenges the current notion of attributional robustness, which largely ignores the difference in the model's outputs, and introduces a new way of evaluating the robustness of attribution methods. Specifically, we propose a new definition of similar inputs, a new robustness metric, and a novel method based on generative adversarial networks to generate these inputs. In addition, we present a comprehensive evaluation with existing metrics and state-of-the-art attribution methods. Our findings highlight the need for a more objective metric that reveals the weaknesses of an attribution method rather than those of the neural network, thus providing a more accurate evaluation of the robustness of attribution methods.
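
To make the abstract's motivation concrete, here is a minimal sketch of why an attribution-distance score that ignores the model's output can be misleading. It is not the paper's definition of similar inputs, its metric, or its GAN-based generator; the abstract gives no formulas for any of them. The toy logistic model, its weights, the gradient-times-input attribution, and every function name below are assumptions chosen purely for illustration.

```python
import math

# Hypothetical toy model: logistic regression with fixed, made-up weights.
W = [2.0, -1.0, 0.5]
B = 0.0


def predict(x):
    """Model output: sigmoid of the linear score w . x + b."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))


def grad_times_input(x):
    """Gradient-times-input attribution for the toy model (illustrative choice)."""
    s = predict(x)
    slope = s * (1.0 - s)  # derivative of the sigmoid at the current score
    return [slope * w * xi for w, xi in zip(W, x)]


def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def attribution_shift(x, x_prime):
    """Report the attribution change together with the change in model output."""
    return {
        "input_dist": l2(x, x_prime),
        "attr_dist": l2(grad_times_input(x), grad_times_input(x_prime)),
        "output_gap": abs(predict(x) - predict(x_prime)),
    }


x = [1.0, 1.0, 1.0]
# Perturbation along W: the linear score, and hence the output, changes.
along_w = [1.2, 0.9, 1.05]
# Perturbation orthogonal to W (2*0.1 - 1*0.2 + 0.5*0 = 0): the output is
# unchanged up to floating-point error, yet the attribution still moves.
orthogonal = [1.1, 1.2, 1.0]

shift_along = attribution_shift(x, along_w)
shift_orth = attribution_shift(x, orthogonal)

print("along W:   ", shift_along)
print("orthogonal:", shift_orth)
```

Both perturbed inputs produce a nonzero attribution distance, but only one of them changes the model's output. A score based on `attr_dist` alone cannot tell these cases apart, which is the kind of ambiguity the abstract says current notions of attributional robustness leave unaddressed.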

