Beyond Additivity: Sparse Isotonic Shapley Regression toward Nonlinear Explainability
This work addresses the need for more accurate and efficient feature attribution in high-dimensional explainable AI settings, offering a new method for a known bottleneck.
The paper tackles two problems with Shapley values: feature attributions that are distorted by non-additive payoff functions, and the high computational cost of obtaining sparse explanations. It introduces Sparse Isotonic Shapley Regression (SISR), which in experiments stabilizes attributions and correctly filters irrelevant features.
Shapley values, a gold standard for feature attribution in Explainable AI, face two primary challenges. First, the canonical Shapley framework assumes that the worth function is additive, yet real-world payoff constructions (driven by non-Gaussian distributions, heavy tails, feature dependence, or domain-specific loss scales) often violate this assumption, leading to distorted attributions. Second, achieving sparse explanations in high dimensions by computing dense Shapley values and then applying ad hoc thresholding is prohibitively costly and risks inconsistency. We introduce Sparse Isotonic Shapley Regression (SISR), a unified nonlinear explanation framework. SISR simultaneously learns a monotonic transformation that restores additivity, obviating the need for a closed-form specification, and enforces an L0 sparsity constraint on the Shapley vector, which enhances computational efficiency in large feature spaces. Its optimization algorithm leverages Pool-Adjacent-Violators for efficient isotonic regression and normalized hard-thresholding for support selection, yielding ease of implementation and global convergence guarantees. Analysis shows that SISR recovers the true transformation in a wide range of scenarios and achieves strong support recovery even in high noise. Moreover, we are the first to demonstrate that irrelevant features and inter-feature dependencies can induce a true payoff transformation that deviates substantially from linearity. Experiments in regression, logistic regression, and tree ensembles demonstrate that SISR stabilizes attributions across payoff schemes and correctly filters irrelevant features, whereas standard Shapley values suffer severe rank and sign distortions. By unifying nonlinear transformation estimation with sparsity pursuit, SISR advances the frontier of nonlinear explainability, providing a theoretically grounded and practical attribution framework.
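The two building blocks named in the abstract, Pool-Adjacent-Violators and hard-thresholding with normalization, can be illustrated in isolation. The following is a minimal sketch and not the authors' implementation: the function names `pav` and `hard_threshold_normalized` are hypothetical, the unit L2 norm rescaling is an assumption about what "normalized" means here, and the sketch does not show how SISR alternates or combines the two steps.

```python
import numpy as np

def pav(y):
    """Pool-Adjacent-Violators: least-squares non-decreasing fit to a non-empty sequence y."""
    # Each block stores [sum of values, count]; its fitted level is sum / count.
    blocks = []
    for value in np.asarray(y, dtype=float):
        blocks.append([value, 1])
        # Merge backwards while the last two block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand every block back to one fitted value per original observation.
    return np.concatenate([np.full(count, total / count) for total, count in blocks])

def hard_threshold_normalized(beta, k):
    """Keep the k largest-magnitude entries of beta and zero the rest.

    Assumes 1 <= k <= len(beta). The result is rescaled to unit L2 norm,
    which is an illustrative choice rather than the paper's stated rule.
    """
    beta = np.asarray(beta, dtype=float)
    keep = np.argsort(np.abs(beta))[-k:]   # indices of the k largest |beta_j|
    sparse = np.zeros_like(beta)
    sparse[keep] = beta[keep]
    norm = np.linalg.norm(sparse)
    return sparse / norm if norm > 0 else sparse
```

For instance, `pav([1, 3, 2, 4])` pools the violating pair (3, 2) into its mean and returns `[1, 2.5, 2.5, 4]`, while `hard_threshold_normalized([3, 0.1, -4, 0.2], 2)` keeps the two dominant coordinates and returns `[0.6, 0, -0.8, 0]`.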