Generalized SHAP: Generating multiple types of explanations in machine learning
This work addresses researchers' and practitioners' need for more comprehensive model interpretability. The contribution is incremental, since it builds directly on the established SHAP framework.
The authors address SHAP's limited ability to answer diverse questions about machine learning models by generalizing it to G-SHAP, which generates multiple explanation types, such as classification, intergroup difference, and model failure explanations. They demonstrate its practical use on real data.
Many important questions about a model cannot be answered just by explaining how much each feature contributes to its output. To answer a broader set of questions, we generalize a popular, mathematically well-grounded explanation technique, Shapley Additive Explanations (SHAP). Our new method, Generalized Shapley Additive Explanations (G-SHAP), produces many additional types of explanations, including: 1) General classification explanations: Why is this sample more likely to belong to one class rather than another? 2) Intergroup differences: Why do our model's predictions differ between groups of observations? 3) Model failure: Why does our model perform poorly on a given sample? We formally define these types of explanations and illustrate their practical use on real data.
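To make the idea of the generalization concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the generalization amounts to attributing a user-chosen function g of the model output, rather than the raw output itself, and it estimates Shapley-style attributions by permutation sampling. The names `gshap_values`, `g`, `X_explain`, `X_background`, and `group_difference` are hypothetical and chosen for illustration only.

```python
import numpy as np

def gshap_values(model, g, X_explain, X_background, n_permutations=200, seed=0):
    """Monte Carlo (permutation sampling) estimate of Shapley-style
    attributions for the scalar g(model(X)). Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X_explain.shape
    phi = np.zeros(n_features)
    for _ in range(n_permutations):
        order = rng.permutation(n_features)
        # Start from random background rows: no feature is "revealed" yet.
        idx = rng.integers(0, len(X_background), size=n_samples)
        X_hybrid = X_background[idx].copy()
        prev = g(model(X_hybrid))
        for j in order:
            # Reveal feature j of the explained samples and record the change in g.
            X_hybrid[:, j] = X_explain[:, j]
            curr = g(model(X_hybrid))
            phi[j] += curr - prev
            prev = curr
    return phi / n_permutations

# Toy setup (assumed for illustration): a linear model and a single
# all-zero background row, so the estimates are exact up to rounding.
w = np.array([2.0, -1.0, 0.0])
model = lambda X: X @ w
X_explain = np.array([[1.0, 2.0, 3.0], [3.0, 0.0, 1.0]])
X_background = np.zeros((1, 3))

# Ordinary SHAP-like question: which features drive the mean prediction?
phi_mean = gshap_values(model, np.mean, X_explain, X_background, n_permutations=50)

# Intergroup-difference question: why do predictions differ between
# group A (first row) and group B (second row)?
in_group_a = np.array([True, False])
group_difference = lambda out: out[in_group_a].mean() - out[~in_group_a].mean()
phi_diff = gshap_values(model, group_difference, X_explain, X_background, n_permutations=50)
```

Only `g` changes between the two calls, which is the point of the sketch: a classification or model-failure question would, under the same assumption, be posed by passing a different `g` (for example, a loss computed against known labels). In each case the attributions sum to the difference between g on the explained samples and g on the background draws.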