Refutation of Shapley Values for XAI -- Additional Evidence
This work addresses a foundational issue in XAI by providing additional evidence against the reliability of Shapley values. The contribution is incremental, but it broadens the scope of earlier critiques.
The paper demonstrates the inadequacy of Shapley values for explainable AI by extending counterexamples beyond Boolean classifiers to non-Boolean features and multi-class settings. It also shows that the features changed in minimal $l_0$-distance adversarial examples exclude irrelevant features, which reinforces the argument.
Recent work demonstrated the inadequacy of Shapley values for explainable artificial intelligence (XAI). Although a single counterexample suffices to disprove a theory, a possible criticism of earlier work is that it focused solely on Boolean classifiers. To address this criticism, this paper demonstrates the inadequacy of Shapley values for families of classifiers whose features are not Boolean, and also for families of classifiers in which multiple classes can be picked. Furthermore, the paper shows that the features changed in any minimal $l_0$-distance adversarial example do not include irrelevant features, thus offering further arguments regarding the inadequacy of Shapley values for XAI.
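The kind of mismatch described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy classifier `toy_clf`, its feature domains, the numeric encoding of the classes, and the choice of value function (the expected classifier output under a uniform distribution with a subset of features fixed to the instance's values). The helper names `expected_value`, `shapley_values`, and `minimal_l0_adversarial_sets` are likewise hypothetical. In the toy classifier, fixing feature 0 to the value 1 already forces the prediction, while fixing feature 1 alone does not, so feature 1 is not needed to explain the prediction for the chosen instance.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial


def expected_value(clf, domains, instance, fixed):
    """Average of clf over all points agreeing with `instance` on `fixed`.

    Assumes a uniform distribution over the finite feature domains.
    """
    free = [i for i in range(len(domains)) if i not in fixed]
    total, count = 0, 0
    for values in product(*(domains[i] for i in free)):
        point = list(instance)
        for i, val in zip(free, values):
            point[i] = val
        total += clf(tuple(point))
        count += 1
    return Fraction(total, count)


def shapley_values(clf, domains, instance):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(domains)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = Fraction(0)
        for k in range(n):
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for subset in combinations(others, k):
                s = set(subset)
                with_i = expected_value(clf, domains, instance, s | {i})
                without_i = expected_value(clf, domains, instance, s)
                phi += weight * (with_i - without_i)
        result.append(phi)
    return result


def minimal_l0_adversarial_sets(clf, domains, instance):
    """Smallest-cardinality feature sets whose change alters the prediction."""
    n = len(domains)
    target = clf(instance)
    for k in range(1, n + 1):
        found = []
        for subset in combinations(range(n), k):
            for values in product(*(domains[i] for i in subset)):
                # Every feature in the subset must actually change.
                if any(val == instance[i] for i, val in zip(subset, values)):
                    continue
                point = list(instance)
                for i, val in zip(subset, values):
                    point[i] = val
                if clf(tuple(point)) != target:
                    found.append(subset)
                    break
        if found:
            return found
    return []


# Hypothetical toy classifier: feature 0 in {0, 1, 2}, feature 1 in {0, 1},
# classes encoded as the integers 0, 1, 2 (the encoding is an assumption).
def toy_clf(x):
    if x[0] == 1:
        return 1
    return 2 if x[1] == 1 else 0


DOMAINS = [(0, 1, 2), (0, 1)]
INSTANCE = (1, 1)

print(shapley_values(toy_clf, DOMAINS, INSTANCE))
print(minimal_l0_adversarial_sets(toy_clf, DOMAINS, INSTANCE))
```

Under these assumptions, feature 1 receives a nonzero Shapley value even though it plays no role in fixing the prediction for this instance, whereas the only minimal $l_0$ adversarial change touches feature 0 alone. The numbers depend on the class encoding and on the uniform distribution, so the sketch only conveys the flavor of the argument; it does not reproduce the paper's counterexamples.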