On the Tractability of SHAP Explanations
This paper addresses a fundamental problem for researchers and practitioners in explainable AI: it identifies inherent computational limits on computing SHAP explanations, which makes it a foundational contribution.
The paper studies the computational complexity of computing SHAP explanations for machine learning models in three settings. Over fully-factorized distributions, computing the SHAP explanation is exactly as hard as computing the model's expected value, so it is intractable for commonly used models such as logistic regression. Over naive Bayes distributions, it is intractable even for trivial classifiers. Over empirical distributions, it is #P-hard.
SHAP explanations are a popular feature-attribution mechanism for explainable AI. They use game-theoretic notions to measure the influence of individual features on the prediction of a machine learning model. Despite a lot of recent interest from both academia and industry, it is not known whether SHAP explanations of common machine learning models can be computed efficiently. In this paper, we establish the complexity of computing the SHAP explanation in three important settings. First, we consider fully-factorized data distributions, and show that the complexity of computing the SHAP explanation is the same as the complexity of computing the expected value of the model. This fully-factorized setting is often used to simplify the SHAP computation, yet our results show that the computation can be intractable for commonly used models such as logistic regression. Going beyond fully-factorized distributions, we show that computing SHAP explanations is already intractable for a very simple setting: computing SHAP explanations of trivial classifiers over naive Bayes distributions. Finally, we show that even computing SHAP over the empirical distribution is #P-hard.
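To make the fully-factorized setting concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It assumes binary features that are mutually independent, with `probs[j]` the probability that feature `j` equals 1, and it takes the value of a feature subset S to be the expected model output when the features in S are fixed to the instance's values. The function names `conditional_expectation` and `shap_values` are hypothetical and chosen for illustration. Both loops enumerate exponentially many cases, which is the cost the hardness results suggest cannot be avoided in general.

```python
from itertools import combinations, product
from math import factorial


def conditional_expectation(model, probs, instance, fixed):
    """E[model(X) | X_j = instance[j] for j in fixed], assuming independent
    binary features with Pr(X_j = 1) = probs[j] (illustrative assumption)."""
    n = len(probs)
    free = [j for j in range(n) if j not in fixed]
    total = 0.0
    # Enumerate every assignment to the features that are not fixed.
    for values in product([0, 1], repeat=len(free)):
        x = list(instance)
        weight = 1.0
        for j, v in zip(free, values):
            x[j] = v
            weight *= probs[j] if v == 1 else 1.0 - probs[j]
        total += weight * model(x)
    return total


def shap_values(model, probs, instance):
    """Exact Shapley values by enumerating all subsets of the other features."""
    n = len(probs)
    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        score = 0.0
        for k in range(n):
            # Shapley weight for subsets of size k: k! (n - k - 1)! / n!
            coeff = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                gain = (conditional_expectation(model, probs, instance, s | {i})
                        - conditional_expectation(model, probs, instance, s))
                score += coeff * gain
        scores.append(score)
    return scores


# Usage: for a linear model the result matches the closed form
# w_i * (instance[i] - probs[i]), here approximately [1.0, -2.25].
linear_scores = shap_values(lambda x: 2 * x[0] - 3 * x[1], [0.5, 0.25], [1, 1])
```

For a linear model the expectation has a closed form, so the enumeration is unnecessary. For a logistic regression model, the first result above says that no such shortcut is expected, because computing the SHAP explanation is as hard as computing the model's expected value.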