LG AI CR · Sep 9, 2023

Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing

Jinwen He, Kai Chen, Guozhu Meng, Jiangshan Zhang, Congyi Li

arXiv:2309.05679v1 · 5.35 citations · h-index: 23 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for reliable evaluation of explanation methods in AI interpretability, which is crucial for model debugging and security, though it is incremental as it builds on existing faithfulness testing.

The paper tackles the problem of evaluating the faithfulness of local explanation methods for deep learning models, proposing three trend-based tests that outperform traditional tests on image, natural language, and security tasks, enabling assessment on complex data for the first time.

While enjoying the great achievements brought by deep learning (DL), people are also worried about the decisions made by DL models, since the high degree of non-linearity of DL models makes those decisions extremely difficult to understand. Consequently, attacks such as adversarial attacks are easy to carry out but difficult to detect and explain, which has led to a boom in research on local explanation methods for explaining model decisions. In this paper, we evaluate the faithfulness of explanation methods and find that traditional faithfulness tests encounter the random dominance problem, i.e., random selection performs the best, especially on complex data. To solve this problem, we propose three trend-based faithfulness tests and empirically demonstrate that the new trend tests assess faithfulness better than traditional tests on image, natural language, and security tasks. We implement the assessment system and evaluate ten popular explanation methods. Benefiting from the trend tests, we successfully assess the explanation methods on complex data for the first time, bringing unprecedented discoveries and inspiring future research. Downstream tasks also benefit greatly from the tests. For example, model debugging equipped with faithful explanation methods performs much better at detecting and correcting accuracy and security problems.
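To make the comparison against random selection concrete, here is a minimal sketch of a generic deletion-style faithfulness check. It is an illustration under stated assumptions, not the paper's traditional or trend-based tests: the toy linear model, the gradient-times-input attribution, and every function name below are hypothetical.

```python
import random


def model(x, w):
    """Toy linear 'model' (an assumption): a weighted sum of the features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def attribution(x, w):
    """Gradient-times-input scores, which are exact for a linear model."""
    return [wi * xi for wi, xi in zip(w, x)]


def deletion_drop(x, w, indices):
    """Drop in model output after zeroing the features in `indices`."""
    masked = [0.0 if i in indices else xi for i, xi in enumerate(x)]
    return model(x, w) - model(masked, w)


def deletion_test(x, w, k, rng):
    """Compare deleting the top-k attributed features with k random ones."""
    scores = attribution(x, w)
    ranked = sorted(range(len(x)), key=lambda i: scores[i], reverse=True)
    top_k = set(ranked[:k])
    random_k = set(rng.sample(range(len(x)), k))
    return deletion_drop(x, w, top_k), deletion_drop(x, w, random_k)


rng = random.Random(0)
w = [rng.uniform(-1, 1) for _ in range(20)]
x = [rng.uniform(0, 1) for _ in range(20)]
explained_drop, random_drop = deletion_test(x, w, k=5, rng=rng)
print(f"top-k drop: {explained_drop:.3f}, random drop: {random_drop:.3f}")
```

On this linear toy the attribution is exact, so deleting the top-ranked features can never reduce the output less than a random choice does. The random dominance problem described in the abstract is the opposite outcome, where random selection scores best on complex data.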
