Accurate Shapley Values for explaining tree-based models
This work addresses the challenge of providing reliable explanations for tree-based models in explainable AI. The contribution is incremental, but it improves the practical accuracy of the explanations.
The paper tackles the problem of inaccurate Shapley Values for tree-based models by introducing two efficient estimators that exploit the tree structure. In simulations, these estimators yield more accurate explanations than state-of-the-art methods.
Shapley Values (SV) are widely used in explainable AI, but their estimation and interpretation can be challenging, leading to inaccurate inferences and explanations. As a starting point, we recall an invariance principle for SV and derive the correct approach for computing the SV of categorical variables, which are particularly sensitive to the encoding used. In the case of tree-based models, we introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods. Simulations and comparisons with state-of-the-art algorithms show the practical gain of our approach. Finally, we discuss the limitations of Shapley Values as a local explanation. These methods are available as a Python package.
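To make the sensitivity to encoding concrete, the following minimal sketch computes exact Shapley Values by brute-force enumeration on a toy cooperative game. It is neither one of the two tree-based estimators nor the API of the accompanying package. The names `shapley_values`, `v_columns`, `v_grouped`, and the game itself are hypothetical and chosen only for illustration. The sketch compares two ways of attributing a one-hot encoded categorical variable: summing the SV of its dummy columns, and treating the dummy columns as a single player.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating all coalitions (exponential cost)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that player i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


# Hypothetical toy game on three encoded columns: "a" and "b" are the two
# dummy columns of one categorical variable, "c" is a numeric feature.
def v_columns(cols):
    worth = 0.0
    if "c" in cols:
        worth += 1.0  # effect of the numeric feature alone
    if {"a", "b", "c"} <= cols:
        worth += 3.0  # interaction that needs every column
    return worth


# Players = individual encoded columns.
phi_columns = shapley_values(["a", "b", "c"], v_columns)
summed_dummies = phi_columns["a"] + phi_columns["b"]

# Players = original variables; the categorical variable owns both dummies.
groups = {"cat": {"a", "b"}, "num": {"c"}}


def v_grouped(coalition):
    cols = set()
    for name in coalition:
        cols |= groups[name]
    return v_columns(cols)


phi_grouped = shapley_values(list(groups), v_grouped)
```

In this toy game the two dummy columns together receive about 2.0, whereas the categorical variable treated as one player receives about 1.5, even though both decompositions sum to the same total of 4.0. The enumeration is exponential in the number of players, so it is only practical for very small examples.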