ML | LG | Jun 1

ShaplEIG: Bayesian Experimental Design for Shapley Value Estimation

arXiv:2606.02247 | 86.4
AI Analysis

For practitioners who need Shapley values when each value function evaluation is costly (e.g., retraining-based feature importance), this method offers a principled adaptive sampling approach that improves sample efficiency over state-of-the-art baselines in the low-budget regime.

ShaplEIG uses Bayesian experimental design with a Gaussian process surrogate to adaptively select coalitions for Shapley value estimation. Each coalition is chosen by its expected information gain about the Shapley values, and the method consistently improves sample efficiency over state-of-the-art baselines in low-budget regimes.

Shapley values are a principled attribution measure widely used in interpretable machine learning, but their exact computation scales exponentially with the number of players, motivating a wide range of approximation methods based on value function evaluations of sampled coalitions. This raises the question of whether approximation accuracy can be improved by adaptively selecting coalitions for evaluation based on previous evaluations. This is particularly relevant in settings where the value function is costly and the number of evaluations is severely limited, such as retraining-based feature importance, data valuation, and hyperparameter importance. For this purpose, we propose ShaplEIG, a Bayesian experimental design approach that approximates the expensive value function using a Gaussian process surrogate and adaptively selects coalitions based on their expected information gain about the Shapley values. By the linearity of the Shapley values in the value function, we show that the expected information gain is available in closed form. Furthermore, we propose an efficient computation scheme that reduces the complexity from exponential to polynomial in the number of players via elementary symmetric polynomials. In extensive experiments across diverse costly applications, our method consistently improves sample efficiency in the low-budget regime over state-of-the-art baselines.
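The role of linearity can be illustrated with a small brute-force sketch. It is not the paper's implementation: it enumerates all 2^n coalitions (so it does not use the polynomial-time scheme based on elementary symmetric polynomials), and the RBF kernel on coalition indicator vectors, the noise variance, and all function names are assumptions made for illustration. The sketch relies only on standard Gaussian facts: if the value function has a Gaussian posterior with covariance K and the Shapley vector is phi = W v, then phi is Gaussian with covariance W K Wᵀ, and the mutual information between phi and a noisy observation of v(S) has a closed form via the matrix determinant lemma.

```python
import itertools
import math
import numpy as np


def shapley_weight_matrix(n):
    """Return (coalitions, W) with phi = W @ v, where v is indexed by coalitions.

    Uses the classical formula
    phi_i = sum_{S not containing i} |S|! (n-|S|-1)! / n! * (v(S + i) - v(S)).
    """
    coalitions = [frozenset(c) for r in range(n + 1)
                  for c in itertools.combinations(range(n), r)]
    index = {S: k for k, S in enumerate(coalitions)}
    W = np.zeros((n, len(coalitions)))
    for S in coalitions:
        s = len(S)
        for i in range(n):
            if i in S:
                continue
            w = math.factorial(s) * math.factorial(n - s - 1) / math.factorial(n)
            W[i, index[S | {i}]] += w
            W[i, index[S]] -= w
    return coalitions, W


def information_gain(W, K, noise_var):
    """Mutual information I(phi; y_S) for every coalition S.

    Assumes v ~ N(m, K) and y_S = v(S) + eps with eps ~ N(0, noise_var).
    Closed form: -0.5 * log(1 - c^T Sigma_phi^{-1} c / Var(y_S)),
    where c = Cov(phi, v(S)).
    """
    Sigma_phi = W @ K @ W.T                  # covariance of the Shapley vector
    C = W @ K                                # column k is Cov(phi, v(S_k))
    sol = np.linalg.solve(Sigma_phi, C)      # Sigma_phi^{-1} c for all columns
    explained = np.einsum("ik,ik->k", C, sol)
    var_y = np.diag(K) + noise_var
    return -0.5 * np.log1p(-explained / var_y)


# Toy setup with n = 4 players (16 coalitions).
n = 4
coalitions, W = shapley_weight_matrix(n)

# Assumed surrogate covariance: RBF kernel on binary membership vectors.
X = np.array([[1.0 if i in S else 0.0 for i in range(n)] for S in coalitions])
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dist)

gains = information_gain(W, K, noise_var=1e-2)
best = coalitions[int(np.argmax(gains))]
print("next coalition to evaluate:", sorted(best))
```

In an adaptive loop, one would evaluate the selected coalition, update the Gaussian process posterior (and hence K), and recompute the gains. The enumeration here is only feasible for a handful of players, which is the cost the paper's efficient computation scheme is designed to avoid.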
