ML LG ME | Dec 8, 2025

$φ$-test: Global Feature Selection and Inference for Shapley Additive Explanations

Dongseok Kim, Hyoungsun Choi, Mohamed Jismy Aashik Rasool, Gisung Oh

arXiv:2512.07578v1 | h-index: 1

Originality: Incremental advance

AI Analysis

For researchers and practitioners in machine learning interpretability, $φ$-test provides a practical global explanation layer that links Shapley-based importance summaries with classical statistical inference.

The paper tackles global feature selection and significance testing for black-box predictors. It proposes $φ$-test, which combines Shapley attributions with selective inference to output a global score, a surrogate coefficient, and post-selection $p$-values and confidence intervals for each retained feature. Experiments on real tabular regression tasks suggest that $φ$-test can retain much of the original model's predictive ability while using only a few features, and that the selected feature sets remain fairly stable across resamples and backbone classes.

We propose $φ$-test, a global feature-selection and significance procedure for black-box predictors that combines Shapley attributions with selective inference. Given a trained model and an evaluation dataset, $φ$-test performs SHAP-guided screening and fits a linear surrogate on the screened features via a selection rule with a tractable selective-inference form. For each retained feature, it outputs a Shapley-based global score, a surrogate coefficient, and post-selection $p$-values and confidence intervals in a global feature-importance table. Experiments on real tabular regression tasks with tree-based and neural backbones suggest that $φ$-test can retain much of the predictive ability of the original model while using only a few features and producing feature sets that remain fairly stable across resamples and backbone classes. In these settings, $φ$-test acts as a practical global explanation layer linking Shapley-based importance summaries with classical statistical inference.
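The pipeline described in the abstract (SHAP-guided screening, a linear surrogate on the screened features, and a per-feature importance table) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the Shapley attributions come from a simple permutation-sampling estimator with a mean baseline rather than a SHAP library, the surrogate is regressed on the black-box predictions, and plain sample splitting (screen on one half, infer on the other) stands in for the paper's selective-inference rule, which the abstract does not specify. The names `shapley_values`, `phi_test_sketch`, and `black_box` are hypothetical, and the $p$-values and intervals use a normal approximation.

```python
import math
import numpy as np


def shapley_values(predict, X, baseline, n_perm=20, seed=0):
    """Permutation-sampling estimate of Shapley attributions for each row of X.

    Features are switched from the baseline to their observed value one at a
    time in random order; the change in prediction is credited to that feature.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    phi = np.zeros((n, d))
    for _ in range(n_perm):
        z = np.tile(baseline, (n, 1))      # every row starts at the baseline
        prev = predict(z)
        for j in rng.permutation(d):
            z[:, j] = X[:, j]              # reveal feature j
            cur = predict(z)
            phi[:, j] += cur - prev        # marginal contribution of j
            prev = cur
    return phi / n_perm


def phi_test_sketch(predict, X, k=2, seed=0):
    """Screen by mean |Shapley value|, then fit an OLS surrogate on held-out rows.

    Assumption: sample splitting replaces the paper's selective-inference step.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    X_sel, X_inf = X[idx[:half]], X[idx[half:]]

    # Screening half: global score = mean absolute attribution per feature.
    phi = shapley_values(predict, X_sel, X_sel.mean(axis=0), seed=seed)
    score = np.abs(phi).mean(axis=0)
    keep = np.sort(np.argsort(score)[::-1][:k])

    # Inference half: linear surrogate of the black-box predictions.
    A = np.column_stack([np.ones(len(X_inf)), X_inf[:, keep]])
    y = predict(X_inf)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(y) - A.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))

    table = []
    for pos, j in enumerate(keep, start=1):   # position 0 is the intercept
        z = beta[pos] / se[pos]
        table.append({
            "feature": int(j),
            "shap_score": float(score[j]),
            "coef": float(beta[pos]),
            "p_value": math.erfc(abs(z) / math.sqrt(2)),  # two-sided, normal approx.
            "ci_low": float(beta[pos] - 1.96 * se[pos]),
            "ci_high": float(beta[pos] + 1.96 * se[pos]),
        })
    return table


def black_box(X):
    """Toy stand-in for a trained model: two strong features, one weak, one unused."""
    return 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * np.sin(X[:, 2])


X_eval = np.random.default_rng(1).normal(size=(400, 4))
table = phi_test_sketch(black_box, X_eval, k=2)
for row in table:
    print(row)
```

Because the rows used for screening are disjoint from the rows used for the surrogate fit, the reported $p$-values are not distorted by the selection step, at the cost of using only half the evaluation data for each stage. A selective-inference rule of the kind the abstract describes would aim to avoid that loss.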
