AI · Nov 11, 2022

REVEL Framework to measure Local Linear Explanations for black-box models: Deep Learning Image Classification case of study

arXiv:2211.06154v1 · 9 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the lack of consensus and theoretical inconsistencies in evaluating explanations for AI models, which is crucial for improving interpretability in domains like image classification, though it appears incremental as it builds on existing qualitative proposals.

The authors tackle the problem of evaluating the quality of local linear explanations for black-box models, particularly in image classification. They propose the REVEL framework, which standardizes the concept of explanation and introduces metrics both for comparing explanations and for assessing a single explanation in absolute terms, and they demonstrate its descriptive and analytical power on four benchmark datasets.

Explainable artificial intelligence is proposed to provide explanations for the reasoning performed by an Artificial Intelligence. There is no consensus on how to evaluate the quality of these explanations, since even the definition of explanation itself is not clear in the literature. In particular, for the widely known Local Linear Explanations, there are qualitative proposals for the evaluation of explanations, although they suffer from theoretical inconsistencies. The case of images is even more problematic: a visual explanation may seem to explain a decision when detecting edges is what it really does. There are a large number of metrics in the literature specialized in quantitatively measuring different qualitative aspects, so we should be able to develop metrics capable of measuring the desirable aspects of the explanations in a robust and correct way. In this paper, we propose a procedure called REVEL to evaluate different aspects concerning the quality of explanations with a theoretically coherent development. This procedure advances the state of the art in several ways: it standardizes the concept of explanation and develops a series of metrics that not only allow explanations to be compared with one another but also provide absolute information about an explanation itself. The experiments have been carried out on four image datasets used as benchmarks, where we show REVEL's descriptive and analytical power.
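For readers unfamiliar with the term, a Local Linear Explanation approximates a black-box model near one input with a linear surrogate whose coefficients act as feature importances. The sketch below illustrates that general idea in the style of LIME. It is not the REVEL procedure and not the authors' code: the function name `local_linear_explanation`, its parameters, the Gaussian sampling scheme, and the toy black box are all assumptions made for illustration.

```python
import numpy as np


def local_linear_explanation(black_box, x, n_samples=500, sigma=0.5,
                             kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate to `black_box` around `x`.

    Hypothetical helper for illustration only.
    Returns (coefficients, intercept); the intercept estimates black_box(x).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)

    # Sample a neighbourhood of x with Gaussian perturbations (an assumption;
    # image explainers usually switch superpixels on and off instead).
    Z = x + sigma * rng.standard_normal((n_samples, x.size))
    y = np.array([black_box(z) for z in Z])

    # Weight each sample by its proximity to x with an exponential kernel.
    sq_dist = np.sum((Z - x) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * kernel_width ** 2))

    # Weighted least squares: scale rows by sqrt(weight), then solve.
    design = np.hstack([Z - x, np.ones((n_samples, 1))])
    root_w = np.sqrt(weights)
    solution, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w,
                                   rcond=None)
    return solution[:-1], solution[-1]


# Toy black box (hypothetical): linear in the first two features only.
def toy_black_box(z):
    return 3.0 * z[0] - 2.0 * z[1] + 1.0


coefs, intercept = local_linear_explanation(
    toy_black_box, np.array([0.5, -1.0, 2.0])
)
```

Because the toy model is itself linear, the surrogate recovers its coefficients, with a near-zero weight on the unused third feature. For a real classifier the surrogate is only an approximation, and how good that approximation is remains the kind of question that evaluation frameworks such as REVEL aim to answer quantitatively.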

Code Implementations: 1 repo
