Joint Shapley values: a measure of joint feature importance
This work provides a new measure of feature importance that could benefit researchers and practitioners in interpretable AI by offering insight into the joint effects of features. It appears incremental, however, since it builds directly on the existing foundations of the Shapley value.
The authors address the problem of measuring feature importance in machine learning models by introducing joint Shapley values, which extend the standard Shapley value to measure the average contribution of sets of features. They prove that joint Shapley values are unique and show that they give intuitive results in ML attribution problems.
The Shapley value is one of the most widely used measures of feature importance, partly because it measures a feature's average effect on a model's prediction. We introduce joint Shapley values, which directly extend Shapley's axioms and intuitions: joint Shapley values measure a set of features' average contribution to a model's prediction. We prove the uniqueness of joint Shapley values for any order of explanation. Results for games show that joint Shapley values present different insights from existing interaction indices, which assess the effect of a feature within a set of features. The joint Shapley values provide intuitive results in ML attribution problems. With binary features, we present a presence-adjusted global value that is more consistent with local intuitions than the usual approach.
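For orientation, the following is a minimal sketch of the classical (single-feature) Shapley value that the abstract takes as its starting point. It is not the joint Shapley value introduced in this work, whose definition is not reproduced here. The function name `shapley_values` and the three-feature `toy_game` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, v):
    """Classical Shapley values for an n-player game.

    v maps a frozenset of player indices to a real number.
    This is the standard single-player formula, NOT the joint extension.
    """
    players = range(n)
    phi = [0.0] * n
    for i in players:
        others = [j for j in players if j != i]
        for size in range(n):
            # Weight of a coalition S of this size that excludes i:
            # |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of i when joining coalition S.
                phi[i] += weight * (v(s | {i}) - v(s))
    return phi


# Hypothetical toy game (an assumption for illustration only):
# features 0 and 1 create value 2 only when both are present,
# and feature 2 contributes 1 on its own.
def toy_game(s):
    return (2.0 if {0, 1} <= s else 0.0) + (1.0 if 2 in s else 0.0)


phi = shapley_values(3, toy_game)
print(phi)
```

In this toy game all three features receive the same attribution (about 1.0 each, summing to the value of the full feature set), even though features 0 and 1 only matter together. Single-feature values alone do not reveal that distinction, which is the kind of question a set-level measure is meant to address.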