Understanding Interventional TreeSHAP: How and Why it Works
This work addresses a theoretical gap for researchers and practitioners in interpretable machine learning, though its contribution is incremental: it builds on existing algorithms rather than introducing new methods.
The paper tackles the lack of a formal proof for Interventional TreeSHAP, an efficient algorithm for computing Shapley values in tree ensembles. It provides a proof that clarifies how the algorithm works and that extends to Shapley-Taylor indices and to one-hot-encoded features.
Shapley values are ubiquitous in interpretable Machine Learning due to their strong theoretical background and efficient implementation in the SHAP library. Computing these values previously induced an exponential cost with respect to the number of input features of an opaque model. Now, with efficient implementations such as Interventional TreeSHAP, this exponential burden is alleviated, provided that one is explaining ensembles of decision trees. Although Interventional TreeSHAP has risen in popularity, it still lacks a formal proof of how and why it works. We provide such a proof with the aim of not only increasing the transparency of the algorithm but also encouraging further development of these ideas. Notably, our proof for Interventional TreeSHAP is easily adapted to Shapley-Taylor indices and one-hot-encoded features.
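To make the exponential cost mentioned above concrete, the following is a minimal sketch of the brute-force computation that an efficient algorithm such as Interventional TreeSHAP avoids. It is not the TreeSHAP algorithm itself and does not use the SHAP library. It assumes the standard Shapley formula applied to a single foreground point x and a single background point z, where features in a coalition S take their values from x and all others from z. The names interventional_shapley and toy_tree are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def interventional_shapley(f, x, z):
    """Brute-force Shapley values of f at x relative to one background point z.

    Enumerates every coalition S of the other features, so the number of
    model evaluations grows exponentially with the number of features.
    """
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Shapley weight |S|! (d - |S| - 1)! / d!
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for S in combinations(others, size):
                # Hybrid input: features in S come from x, the rest from z.
                without_i = [x[j] if j in S else z[j] for j in range(d)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


def toy_tree(v):
    """Hypothetical depth-2 decision tree over features 0 and 1 (feature 2 is unused)."""
    if v[0] <= 0.5:
        return 1.0 if v[1] <= 0.5 else 2.0
    return 4.0


x = [1.0, 1.0, 1.0]  # foreground point being explained
z = [0.0, 0.0, 0.0]  # single background point
phi = interventional_shapley(toy_tree, x, z)
print(phi)
```

In this toy setting the attributions sum to toy_tree(x) - toy_tree(z), and the unused feature 2 receives zero attribution. The inner loops visit all 2^(d-1) coalitions for each of the d features, which is the exponential burden that exploiting the tree structure removes.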