AI Nov 11, 2022
REVEL Framework to measure Local Linear Explanations for black-box models: Deep Learning Image Classification case of study
Iván Sevillano-García, Julián Luengo-Martín, Francisco Herrera
Explainable artificial intelligence is proposed to provide explanations for the reasoning performed by an Artificial Intelligence. There is no consensus on how to evaluate the quality of these explanations, since even the definition of explanation itself is not clear in the literature. In particular, for the widely known Local Linear Explanations, there are qualitative proposals for the evaluation of explanations, although they suffer from theoretical inconsistencies. The case of images is even more problematic: a visual explanation may seem to explain a decision when all it really does is detect edges. The literature contains a large number of metrics specialized in quantitatively measuring different qualitative aspects, so it should be possible to develop metrics that measure the desirable aspects of explanations in a robust and correct way. In this paper, we propose a procedure called REVEL to evaluate different aspects concerning the quality of explanations with a theoretically coherent development. This procedure advances the state of the art in several ways: it standardizes the concept of explanation and develops a series of metrics that not only allow explanations to be compared with one another but also provide absolute information about each explanation itself. The experiments have been carried out on four image datasets as benchmarks, where we show REVEL's descriptive and analytical power.
AI Apr 3, 2024
X-SHIELD: Regularization for eXplainable Artificial Intelligence
Iván Sevillano-García, Julián Luengo, Francisco Herrera
As artificial intelligence systems become integral across domains, the demand for explainability grows, a need addressed by so-called eXplainable artificial intelligence (XAI). Existing efforts primarily focus on generating and evaluating explanations for black-box models, while a critical gap remains in using these evaluations to directly enhance the models. It is important to consider the potential of the explanation process to improve model quality by feeding it back into training as well. XAI may be used to improve model performance while boosting its explainability. Under this view, this paper introduces Transformation - Selective Hidden Input Evaluation for Learning Dynamics (T-SHIELD), a regularization family designed to improve model quality by hiding features of the input, forcing the model to generalize without those features. Within this family, we propose XAI-SHIELD (X-SHIELD), a regularization for explainable artificial intelligence, which uses explanations to select the specific features to hide. In contrast to conventional approaches, X-SHIELD regularization integrates seamlessly into the objective function, enhancing model explainability while also improving performance. Experimental validation on benchmark datasets underscores X-SHIELD's effectiveness in improving performance and overall explainability. The improvement is validated through experiments comparing models with and without the X-SHIELD regularization, with further analysis exploring the rationale behind its design choices. This establishes X-SHIELD regularization as a promising pathway for developing reliable artificial intelligence regularization.
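The abstract describes the idea only at a high level, so the following is a minimal sketch of an explanation-guided "hide features" regularizer, not the paper's formulation. Everything specific here is an assumption made for illustration: the toy logistic model, the attribution |w_i * x_i|, the rule of hiding the lowest-scoring features by zeroing them, the squared-difference penalty, and the names `hide_features` and `regularized_loss`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Toy logistic model standing in for the black-box classifier."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def importance(weights, x):
    """Per-feature attribution |w_i * x_i| (assumed explanation method)."""
    return [abs(w * xi) for w, xi in zip(weights, x)]

def hide_features(x, scores, n_hide):
    """Zero out the n_hide lowest-scoring features (assumed selection rule)."""
    order = sorted(range(len(x)), key=lambda i: scores[i])
    hidden = set(order[:n_hide])
    return [0.0 if i in hidden else xi for i, xi in enumerate(x)]

def regularized_loss(weights, bias, x, y, lam=1.0, n_hide=2):
    """Cross-entropy plus a penalty on how much the output moves when
    the selected features are hidden (assumed form of the regularizer)."""
    p = predict(weights, bias, x)
    task = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    x_hidden = hide_features(x, importance(weights, x), n_hide)
    penalty = (p - predict(weights, bias, x_hidden)) ** 2
    return task + lam * penalty, task, penalty

# Hypothetical weights and input, chosen only to exercise the functions.
weights, bias = [2.0, -1.0, 0.1, 0.05], 0.0
x, y = [1.0, 0.5, 1.0, -1.0], 1
total, task, penalty = regularized_loss(weights, bias, x, y)
print(f"task={task:.4f} penalty={penalty:.6f} total={total:.4f}")
```

Because the penalty is simply added to the task loss, it would be minimized by the same optimizer as the rest of the objective, which is the sense in which such a regularizer "integrates into the objective function".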
LG Apr 3, 2025
STOOD-X methodology: using statistical nonparametric test for OOD Detection Large-Scale datasets enhanced with explainability
Iván Sevillano-García, Julián Luengo, Francisco Herrera
Out-of-Distribution (OOD) detection is a critical task in machine learning, particularly in safety-sensitive applications where model failures can have serious consequences. However, current OOD detection methods often suffer from restrictive distributional assumptions, limited scalability, and a lack of interpretability. To address these challenges, we propose STOOD-X, a two-stage methodology that combines a Statistical nonparametric Test for OOD Detection with eXplainability enhancements. In the first stage, STOOD-X uses feature-space distances and a Wilcoxon-Mann-Whitney test to identify OOD samples without assuming a specific feature distribution. In the second stage, it generates user-friendly, concept-based visual explanations that reveal the features driving each decision, aligning with the BLUE XAI paradigm. Through extensive experiments on benchmark datasets and multiple architectures, STOOD-X achieves competitive performance against state-of-the-art post hoc OOD detectors, particularly in high-dimensional and complex settings. In addition, its explainability framework enables human oversight, bias detection, and model debugging, fostering trust and collaboration between humans and AI systems. The STOOD-X methodology therefore offers a robust, explainable, and scalable solution for real-world OOD detection tasks.
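To make the first stage more concrete, here is a rough sketch of how feature-space distances could be combined with a one-sided Wilcoxon-Mann-Whitney test. It is an illustration under stated assumptions, not the STOOD-X procedure itself: the use of k-nearest-neighbour Euclidean distances, the held-out calibration set, the synthetic Gaussian "features", and the names `knn_distances`, `mann_whitney_greater`, and `ood_p_value` are all hypothetical. The test is hand-rolled with a normal approximation to keep the example dependency-free; in practice a library routine such as `scipy.stats.mannwhitneyu` would be the natural choice.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_distances(x, reference, k):
    """Distances from x to its k nearest neighbours in reference."""
    return sorted(euclidean(x, r) for r in reference)[:k]

def mann_whitney_greater(sample, baseline):
    """One-sided p-value for H1: sample tends to be larger than baseline.
    Normal approximation with continuity correction, no tie correction."""
    n1, n2 = len(sample), len(baseline)
    # U counts pairs where the sample value exceeds the baseline value.
    u = sum((s > b) + 0.5 * (s == b) for s in sample for b in baseline)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean - 0.5) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability

def ood_p_value(x, train_feats, calib_feats, k=10):
    """Small p-value: x sits farther from the training features than
    held-out in-distribution samples typically do (assumed criterion)."""
    query = knn_distances(x, train_feats, k)
    baseline = [d for c in calib_feats
                for d in knn_distances(c, train_feats, k)]
    return mann_whitney_greater(query, baseline)

# Synthetic stand-ins for feature vectors extracted by a trained network.
rng = random.Random(0)

def gauss_vec(mu, dim=8):
    return [rng.gauss(mu, 1.0) for _ in range(dim)]

train_feats = [gauss_vec(0.0) for _ in range(300)]
calib_feats = [gauss_vec(0.0) for _ in range(40)]
p_in = ood_p_value(gauss_vec(0.0), train_feats, calib_feats)
p_out = ood_p_value(gauss_vec(6.0), train_feats, calib_feats)
print(f"in-distribution p={p_in:.3g}, shifted p={p_out:.3g}")
```

The rank-based test only compares orderings of distances, so nothing in the sketch assumes a particular distribution for the features. The k distances of a single query are not independent draws, so the resulting p-value is best read as a score to threshold rather than an exact significance level.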