LG AI Oct 27, 2023

How Well Do Feature-Additive Explainers Explain Feature-Additive Predictors?

arXiv:2310.18496v1, 9 citations, h-index: 41
Originality: Incremental advance
AI Analysis

This paper addresses a critical evaluation gap in Explainable AI for high-stakes domains, but the advance is incremental because it covers only one class of models and explainers.

The paper investigates whether popular feature-additive explainers (e.g., LIME, SHAP) can accurately explain feature-additive predictors. It finds that all explainers eventually fail to attribute feature importance correctly, especially when feature interactions are involved.

Surging interest in deep learning from high-stakes domains has precipitated concern over the inscrutable nature of black box neural networks. Explainable AI (XAI) research has led to an abundance of explanation algorithms for these black boxes. Such post hoc explainers produce human-comprehensible explanations; however, their fidelity with respect to the model is not well understood, and explanation evaluation remains one of the most challenging issues in XAI. In this paper, we ask a targeted but important question: can popular feature-additive explainers (e.g., LIME, SHAP, SHAPR, MAPLE, and PDP) explain feature-additive predictors? Herein, we evaluate such explainers on ground truth that is analytically derived from the additive structure of a model. We demonstrate the efficacy of our approach in understanding these explainers applied to symbolic expressions, neural networks, and generalized additive models on thousands of synthetic and several real-world tasks. Our results suggest that all explainers eventually fail to correctly attribute the importance of features, especially when a decision-making process involves feature interactions.
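To make the idea of ground truth derived from additive structure concrete, here is a minimal sketch, not the paper's code. It assumes a hypothetical predictor f(x) = 2*x0 + x1*x2, a hypothetical all-zero baseline, and an exact Shapley computation in which absent features are replaced by their baseline values. The names `TERMS`, `ground_truth`, `shapley_values`, and `compare` are illustrative. Summing per-feature attributions over each additive term before comparing is likewise an assumption made for this sketch.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature-additive predictor: f(x) = 2*x0 + x1*x2.
# Each additive term is stored with the feature indices it depends on.
TERMS = [
    ((0,), lambda x: 2.0 * x[0]),      # main effect of x0
    ((1, 2), lambda x: x[1] * x[2]),   # interaction between x1 and x2
]

def predict(x):
    """Evaluate the predictor as the sum of its additive terms."""
    return sum(term(x) for _, term in TERMS)

def ground_truth(x, baseline):
    """Contribution of each additive term relative to the baseline input."""
    return {feats: term(x) - term(baseline) for feats, term in TERMS}

def shapley_values(x, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def compare(x, baseline):
    """Pair the summed attribution of each term's features with its ground truth."""
    phi = shapley_values(x, baseline)
    truth = ground_truth(x, baseline)
    return {feats: (sum(phi[i] for i in feats), truth[feats]) for feats in truth}

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
result = compare(x, baseline)
for feats, (explained, true) in result.items():
    print(feats, explained, true)
```

In this easy case the summed attributions agree with the analytic contributions of each term. The individual attributions to x1 and x2 split the interaction evenly, a split that the additive structure alone does not determine. Practical explainers rely on sampling and approximation rather than this exhaustive enumeration, so their per-term error has to be measured rather than assumed.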

Code Implementations: 1 repo
