LG AI ML May 21

Proxy-Based Approximation of Shapley and Banzhaf Interactions

arXiv:2605.22738 77.8
AI Analysis

This work provides a practical, accurate estimator for higher-order feature interactions in machine learning, addressing a key bottleneck in model interpretability.

ProxySHAP introduces a proxy-based approximation for Shapley and Banzhaf interactions that achieves state-of-the-art approximation quality, outperforming prior methods like ProxySPEX and KernelSHAP-IQ in both small- and large-budget regimes, including large-scale applications with thousands of features.

Shapley and Banzhaf interactions capture the complex dynamics inherent in modern machine learning applications. However, current estimators for these higher-order interactions trade off between speed and accuracy. To overcome this limitation, we introduce ProxySHAP. ProxySHAP reconciles the high sample efficiency of tree-based proxy models with a principled path to consistency via residual correction. On a theoretical level, we derive a polynomial-time generalization of interventional TreeSHAP to compute exact interaction indices for tree ensembles, successfully bypassing exponential tree-depth dependencies in prior methods. Furthermore, we formally analyze the residual adjustment strategy, characterizing the specific conditions under which Maximum Sample Reuse (MSR) corrects proxy bias without its variance scaling exponentially with interaction size. Extensive benchmarking demonstrates that ProxySHAP sets a new state-of-the-art standard for approximation quality, including in large-scale applications with thousands of features. By achieving the lowest error in both small- and large-budget regimes, ProxySHAP significantly outperforms the prior best estimators ProxySPEX and KernelSHAP-IQ, while also delivering superior performance on downstream explainability tasks.
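The residual-correction idea in the abstract rests on the linearity of interaction indices: the interaction of a game equals the interaction of a proxy plus the interaction of the residual (game minus proxy). The toy sketch below illustrates this for Banzhaf interactions only. It is not the paper's implementation: the six-player `game`, the hand-written `proxy` (standing in for a fitted tree model), and all function names are assumptions made for illustration, and the proxy term is computed by brute-force enumeration rather than by the polynomial-time tree algorithm the abstract describes. The sampling estimator uses the identity that the Banzhaf interaction of S equals 2^|S| times the expectation of (-1)^(|S| - |A ∩ S|) v(A) over uniformly random coalitions A, so one set of sampled coalitions can be reused for every S.

```python
import itertools
import random

N_PLAYERS = 6  # toy size so that exact enumeration stays cheap


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in itertools.combinations(items, r):
            yield frozenset(combo)


def banzhaf_interaction(value, n, S):
    """Exact Banzhaf interaction of S, enumerating all 2^n coalitions."""
    S = frozenset(S)
    total = 0.0
    for A in subsets(range(n)):
        sign = (-1) ** (len(S) - len(A & S))
        total += sign * value(A)
    return total / 2 ** (n - len(S))


def msr_banzhaf(value, n, S, coalitions):
    """Sampling estimate that reuses the same coalitions for any S.

    The 2^|S| prefactor is why the variance can grow with interaction size.
    """
    S = frozenset(S)
    acc = 0.0
    for A in coalitions:
        sign = (-1) ** (len(S) - len(A & S))
        acc += sign * value(A)
    return 2 ** len(S) * acc / len(coalitions)


def game(A):
    """Hypothetical game: a pair effect, a main effect, a weak triple effect."""
    return (2.0 * (0 in A and 1 in A)
            + 1.0 * (2 in A)
            + 0.3 * (3 in A and 4 in A and 5 in A))


def proxy(A):
    """Hypothetical proxy that captures most of the game but misses the triple."""
    return 2.0 * (0 in A and 1 in A) + 1.0 * (2 in A)


def sample_coalitions(n, budget, seed=0):
    """Draw `budget` coalitions uniformly at random from all subsets."""
    rng = random.Random(seed)
    return [frozenset(i for i in range(n) if rng.random() < 0.5)
            for _ in range(budget)]


def proxy_corrected(game_fn, proxy_fn, n, S, coalitions):
    """Exact interaction of the proxy plus a sampled estimate on the residual."""
    def residual(A):
        return game_fn(A) - proxy_fn(A)
    return (banzhaf_interaction(proxy_fn, n, S)
            + msr_banzhaf(residual, n, S, coalitions))


if __name__ == "__main__":
    coalitions = sample_coalitions(N_PLAYERS, budget=200)
    for S in [{0, 1}, {3, 4}]:
        exact = banzhaf_interaction(game, N_PLAYERS, S)
        direct = msr_banzhaf(game, N_PLAYERS, S, coalitions)
        corrected = proxy_corrected(game, proxy, N_PLAYERS, S, coalitions)
        print(sorted(S), exact, direct, corrected)
```

In this toy setting the residual takes values only near 0 or 0.3, so the sampled correction term for a pair is bounded by about 4 × 0.3 = 1.2 in absolute value, whereas the direct estimate samples the full game, whose values range up to 3.3. This is one way to read the abstract's point that a good proxy keeps the variance of the correction under control.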

