LG · AI · Jan 19

Explanation Multiplicity in SHAP: Characterization and Assessment

arXiv:2601.12654v1 · 3 citations
Originality: Incremental advance
AI Analysis

This paper addresses the reliability of post-hoc explanations for high-stakes automated decisions, though its contribution is incremental: it characterizes an existing problem rather than introducing a new one.

The paper tackles the problem of explanation multiplicity in SHAP, where multiple valid but different explanations exist for the same prediction, and finds this phenomenon is pervasive across datasets and persists even for high-confidence predictions.

Post-hoc explanations are widely used to justify, contest, and audit automated decisions in high-stakes domains. SHAP, in particular, is often treated as a reliable account of which features drove an individual prediction. Yet SHAP explanations can vary substantially across repeated runs even when the input, task, and trained model are held fixed. We term this phenomenon explanation multiplicity: multiple internally valid but substantively different explanations for the same decision. We present a methodology to characterize multiplicity in feature-attribution explanations and to disentangle sources due to model training/selection from stochasticity intrinsic to the explanation pipeline. We further show that apparent stability depends on the metric: magnitude-based distances can remain near zero while rank-based measures reveal substantial churn in the identity and ordering of top features. To contextualize observed disagreement, we derive randomized baseline values under plausible null models. Across datasets, model classes, and confidence regimes, we find explanation multiplicity is pervasive and persists even for high-confidence predictions, highlighting the need for metrics and baselines that match the intended use of explanations.
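The metric dependence described in the abstract can be illustrated with a small sketch. It is not the paper's methodology: the two attribution vectors are made-up numbers standing in for SHAP values from two runs on the same input and model, the function names are hypothetical, and the null model (both top-k sets drawn uniformly at random) is only one assumed choice of baseline.

```python
import numpy as np

def l2_distance(a, b):
    """Magnitude-based distance between two attribution vectors."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def top_k_jaccard(a, b, k):
    """Rank-based agreement: Jaccard overlap of the top-k features by |attribution|."""
    top_a = set(np.argsort(-np.abs(a))[:k].tolist())
    top_b = set(np.argsort(-np.abs(b))[:k].tolist())
    return len(top_a & top_b) / len(top_a | top_b)

def random_baseline_jaccard(n_features, k, n_draws=2000, seed=0):
    """Assumed null model: mean Jaccard overlap of two uniformly random top-k sets."""
    rng = np.random.default_rng(seed)
    overlaps = []
    for _ in range(n_draws):
        a = set(rng.choice(n_features, size=k, replace=False).tolist())
        b = set(rng.choice(n_features, size=k, replace=False).tolist())
        overlaps.append(len(a & b) / len(a | b))
    return float(np.mean(overlaps))

# Hypothetical attributions from two explanation runs (illustrative values only).
run_a = np.array([0.300, 0.101, 0.100, 0.099, 0.098, 0.001])
run_b = np.array([0.300, 0.098, 0.099, 0.100, 0.101, 0.001])

print("L2 distance:     ", l2_distance(run_a, run_b))
print("Top-2 Jaccard:   ", top_k_jaccard(run_a, run_b, k=2))
print("Random baseline: ", random_baseline_jaccard(n_features=6, k=2))
```

In this toy case the L2 distance is close to zero, yet the two runs agree on only one of their top two features. Comparing the observed overlap with the random baseline indicates how far the agreement is from what chance alone would produce under the assumed null.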

