Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles
This work addresses the need for scalable interpretability of tree ensembles in high-stakes domains, though the contribution is incremental in that it builds on prior TreeSHAP methods.
The authors tackled the problem of efficiently computing any-order Shapley interactions for tree ensembles, which are crucial for interpreting complex models like gradient-boosted trees. They developed TreeSHAP-IQ, a method that computes these interactions in a single recursive traversal of the tree, achieving significant speed improvements over existing approaches.
While shallow decision trees may be interpretable, larger ensemble models like gradient-boosted trees, which often set the state of the art in machine learning problems involving tabular data, still remain black-box models. As a remedy, the Shapley value (SV) is a well-known concept in explainable artificial intelligence (XAI) research for quantifying additive feature attributions of predictions. The model-specific TreeSHAP methodology overcomes the exponential complexity of retrieving exact SVs from tree-based models. Expanding beyond individual feature attribution, Shapley interactions reveal the impact of intricate feature interactions of any order. In this work, we present TreeSHAP-IQ, an efficient method to compute any-order additive Shapley interactions for predictions of tree-based models. TreeSHAP-IQ is supported by a mathematical framework that exploits polynomial arithmetic to compute the interaction scores in a single recursive traversal of the tree, akin to Linear TreeSHAP. We apply TreeSHAP-IQ to state-of-the-art tree ensembles and explore interactions on well-established benchmark datasets.
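To make the quantities in the abstract concrete, the following is a minimal brute-force sketch of the Shapley value and of a pairwise Shapley interaction on a toy set function. It is not TreeSHAP-IQ and does not use the polynomial arithmetic the abstract describes; it enumerates all coalitions, which is exactly the exponential cost that tree-specific methods avoid. The pairwise score uses the classical Shapley interaction index, which is assumed here as one common definition and may differ from the indices the paper computes. All function names and the toy game are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of `items` as a frozenset (exponential in len(items))."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def shapley_value(v, n, i):
    """Exact Shapley value of player i for a set function v over players 0..n-1."""
    others = [p for p in range(n) if p != i]
    total = 0.0
    for S in subsets(others):
        weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
        total += weight * (v(S | {i}) - v(S))
    return total


def shapley_interaction(v, n, i, j):
    """Pairwise Shapley interaction index of players i and j (assumed definition)."""
    others = [p for p in range(n) if p not in (i, j)]
    total = 0.0
    for S in subsets(others):
        weight = factorial(len(S)) * factorial(n - len(S) - 2) / factorial(n - 1)
        # Discrete second-order derivative: joint effect minus the individual effects.
        delta = v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S)
        total += weight * delta
    return total


def toy_game(S):
    """Hypothetical game: two main effects plus a synergy between players 0 and 1.

    Player 2 never changes the value, so it acts as a dummy feature.
    """
    return 2.0 * (0 in S) + 1.0 * (1 in S) + 3.0 * (0 in S and 1 in S)


if __name__ == "__main__":
    n = 3
    for i in range(n):
        print(f"SV of player {i}: {shapley_value(toy_game, n, i):.3f}")
    print(f"Interaction (0, 1): {shapley_interaction(toy_game, n, 0, 1):.3f}")
    print(f"Interaction (0, 2): {shapley_interaction(toy_game, n, 0, 2):.3f}")
```

In this toy game the Shapley values split the synergy of 3 evenly between players 0 and 1, which hides where it comes from, while the pairwise interaction score attributes it to the pair directly. That gap between individual attributions and interaction scores is the motivation the abstract gives for going beyond SVs.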