LG | Feb 26

Fair feature attribution for multi-output prediction: a Shapley-based perspective

Umberto Biccari, Alain Ibáñez de Opakua, José María Mato, Óscar Millet, Roberto Morales, Enrique Zuazua

arXiv:2602.22882v1 | 1.4 | h-index: 63

Originality: Incremental advance

AI Analysis

This work clarifies a structural constraint in Shapley-based interpretability for researchers and practitioners working with multi-output machine learning models.

This paper provides an axiomatic characterization of feature attribution for multi-output predictors within the Shapley framework. It demonstrates that any attribution rule satisfying the classical Shapley axioms must decompose component-wise across outputs, implying that joint-output attribution rules require relaxing at least one axiom.

In this article, we provide an axiomatic characterization of feature attribution for multi-output predictors within the Shapley framework. While SHAP explanations are routinely computed independently for each output coordinate, the theoretical necessity of this practice has remained unclear. By extending the classical Shapley axioms to vector-valued cooperative games, we establish a rigidity theorem showing that any attribution rule satisfying efficiency, symmetry, dummy player, and additivity must necessarily decompose component-wise across outputs. Consequently, any joint-output attribution rule must relax at least one of the classical Shapley axioms. This result identifies a previously unformalized structural constraint in Shapley-based interpretability, clarifying the precise scope of fairness-consistent explanations in multi-output learning. Numerical experiments on a biomedical benchmark illustrate that multi-output models can yield computational savings in training and deployment, while producing SHAP explanations that remain fully consistent with the component-wise structure imposed by the Shapley axioms.
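
The component-wise structure described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' code and does not prove the rigidity theorem. It only checks numerically that the classical Shapley formula, applied directly to a vector-valued game, gives the same attributions as applying it to each output coordinate separately. The helper `shapley_values` and the three-feature, two-output game `toy_game` are hypothetical names and values chosen for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(n_players, value):
    """Exact Shapley values by enumerating all coalitions.

    `value` maps a frozenset of player indices to a scalar or a NumPy vector.
    Returns an array with one row per player.
    """
    players = range(n_players)
    phi = [0.0 for _ in players]
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Classical Shapley weight |S|! (n - |S| - 1)! / n!
            weight = (
                factorial(size)
                * factorial(n_players - size - 1)
                / factorial(n_players)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] = phi[i] + weight * (value(s | {i}) - value(s))
    return np.array(phi)


def toy_game(coalition):
    """Hypothetical 3-feature, 2-output game (illustrative values only)."""
    a = 1.0 if 0 in coalition else 0.0
    b = 1.0 if 1 in coalition else 0.0
    c = 1.0 if 2 in coalition else 0.0
    return np.array([2.0 * a + 3.0 * a * b, b + 4.0 * b * c])


# Attribution computed on the vector-valued game in one pass.
joint = shapley_values(3, toy_game)

# Attribution computed independently for each output coordinate.
per_output = np.column_stack(
    [shapley_values(3, lambda s, k=k: toy_game(s)[k]) for k in range(2)]
)

print(np.allclose(joint, per_output))
print(joint.sum(axis=0), toy_game(frozenset({0, 1, 2})) - toy_game(frozenset()))
```

The first print compares the two attribution matrices. The second checks efficiency for each output: the attributions in each column sum to the value of the full coalition minus the value of the empty one. Exhaustive enumeration is exponential in the number of features, so this approach only suits small examples.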
