Explaining the Model and Feature Dependencies by Decomposition of the Shapley Value
This work addresses a key challenge in interpretable AI for end-users by enhancing Shapley value explanations, though it is incremental as it builds on existing methods.
The paper tackles the ambiguity in Shapley value explanations for machine learning models by proposing a decomposition method that separates the model explanation from the feature dependency explanation. The method achieves intuitive results on simple problems and is shown to be equivalent or superior to state-of-the-art implementations on real-world datasets.
Shapley values have become one of the go-to methods for explaining complex models to end-users. They provide a model-agnostic post-hoc explanation with foundations in game theory: what is the worth of a player (in machine learning, a feature value) in the objective function (the output of the complex machine learning model)? One downside is that they always require outputs of the model when some features are missing. These are usually computed by taking the expectation over the missing features. This, however, introduces a non-trivial choice: do we condition on the unknown features or not? In this paper we examine this question and claim that the two choices represent two different explanations, each valid for different end-users: one that explains the model and one that explains the model combined with the feature dependencies in the data. We propose a new algorithmic approach to combine both explanations, removing the burden of choice and enhancing the explanatory power of Shapley values, and show that it achieves intuitive results on simple problems. We apply our method to two real-world datasets and discuss the explanations. Finally, we demonstrate how our method is either equivalent or superior to state-of-the-art Shapley value implementations while simultaneously allowing for increased insight into the model-data structure.
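To make the choice described in the abstract concrete, the following minimal Python sketch computes exact Shapley values for a toy problem under both options: a marginal expectation over the missing features, and a conditional expectation given the observed feature values. It is an illustration only and is not the decomposition algorithm proposed in the paper. The linear model, the coefficients `BETA`, the correlation `RHO`, the instance `X`, and all function names are assumptions introduced for this example; the two features are assumed to be jointly Gaussian with zero mean and unit variance.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function v(S)."""
    phi = [0.0] * n_features
    n_fact = factorial(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n_features - size - 1) / n_fact
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy setup (not taken from the paper).
BETA = (1.0, 0.0)  # the model uses feature 0 only
RHO = 0.8          # assumed correlation between the two features
X = (2.0, 1.0)     # instance to explain


def model(x):
    """Toy linear model f(x) = BETA[0] * x[0] + BETA[1] * x[1]."""
    return BETA[0] * x[0] + BETA[1] * x[1]


def v_marginal(s):
    """Missing features are averaged over their marginal distribution.

    For a linear model with zero-mean features this equals the model
    evaluated with every missing feature set to 0.
    """
    return model(tuple(X[j] if j in s else 0.0 for j in range(2)))


def v_conditional(s):
    """Missing features are averaged conditionally on the known ones.

    For a standard bivariate Gaussian, E[X_j | X_k = x_k] = RHO * x_k.
    """
    if len(s) == 0:
        filled = (0.0, 0.0)
    elif len(s) == 2:
        filled = X
    else:
        (k,) = s
        filled = tuple(X[j] if j == k else RHO * X[k] for j in range(2))
    return model(filled)


PHI_MARGINAL = shapley_values(v_marginal, 2)
PHI_CONDITIONAL = shapley_values(v_conditional, 2)

print("marginal:   ", [round(p, 6) for p in PHI_MARGINAL])
print("conditional:", [round(p, 6) for p in PHI_CONDITIONAL])
```

In this toy setup the marginal variant attributes the whole prediction to feature 0 and nothing to feature 1, which the model ignores. The conditional variant gives feature 1 a non-zero share (about 0.4 of the total 2.0) purely because it is correlated with feature 0. Both attributions sum to the same prediction, which illustrates why the two variants can be read as answering different questions: one about the model alone, the other about the model together with the dependencies in the data.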