ML LG · Apr 15, 2022

Towards a Unified Framework for Uncertainty-aware Nonlinear Variable Selection with Theoretical Guarantees

Wenying Deng, Beau Coker, Rajarshi Mukherjee, Jeremiah Zhe Liu, Brent A. Coull

arXiv:2204.07293v23.84 citations · h-index: 94

Originality: Incremental advance

AI Analysis

This work addresses the challenge of selecting important variables in nonlinear models with theoretical guarantees, which is crucial for interpretability in domains like healthcare. It is an incremental advance because it builds on existing variable selection techniques.

The authors tackle nonlinear variable selection by developing a unified framework that incorporates uncertainty quantification and is compatible with various machine learning models. In simulations and on healthcare benchmark datasets, it outperforms existing variable selection methods.

We develop a simple and unified framework for nonlinear variable selection that incorporates uncertainty in the prediction function and is compatible with a wide range of machine learning models (e.g., tree ensembles, kernel methods, neural networks, etc.). In particular, for a learned nonlinear model $f(\mathbf{x})$, we consider quantifying the importance of an input variable $\mathbf{x}^j$ using the integrated partial derivative $\Psi_j = \Vert \frac{\partial}{\partial \mathbf{x}^j} f(\mathbf{x})\Vert^2_{P_\mathcal{X}}$. We then (1) provide a principled approach for quantifying variable selection uncertainty by deriving its posterior distribution, and (2) show that the approach is generalizable even to non-differentiable models such as tree ensembles. Rigorous Bayesian nonparametric theorems are derived to guarantee the posterior consistency and asymptotic uncertainty of the proposed approach. Extensive simulations and experiments on healthcare benchmark datasets confirm that the proposed algorithm outperforms existing classic and recent variable selection methods.
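
To make the importance measure concrete, here is a minimal sketch of how $\Psi_j$ could be estimated for a smooth model. It is an illustration under stated assumptions, not the authors' algorithm: the norm over $P_\mathcal{X}$ is replaced by an average over a sample, the partial derivative by a central finite difference, and the "posterior draws" are a made-up family of functions whose slope on one input varies. The names `integrated_partial_derivative`, `toy_f`, and `eps` are hypothetical. Finite differences are not meaningful for piecewise-constant models such as tree ensembles, so this sketch does not cover that part of the abstract.

```python
import numpy as np


def integrated_partial_derivative(f, X, eps=1e-4):
    """Sample estimate of Psi_j = E_x[(df/dx_j)^2] for every input j.

    f   : callable mapping an (n, d) array to an (n,) array of predictions
    X   : (n, d) sample standing in for the input distribution P_X
    eps : finite-difference step size (an assumption of this sketch)
    """
    n, d = X.shape
    psi = np.empty(d)
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        # Central difference approximates df/dx_j at every row of X.
        grad_j = (f(X + step) - f(X - step)) / (2.0 * eps)
        psi[j] = np.mean(grad_j ** 2)
    return psi


def toy_f(X):
    # Hypothetical model: depends on x0 and x1, ignores x2.
    return np.sin(X[:, 0]) + 2.0 * X[:, 1]


rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
psi_hat = integrated_partial_derivative(toy_f, X)

# Hypothetical posterior draws of f: the slope on x1 varies across draws.
# Pushing each draw through the estimator gives draws of Psi_j, from which
# an interval per variable can be read off.
slopes = rng.normal(loc=2.0, scale=0.1, size=200)
psi_draws = np.array([
    integrated_partial_derivative(
        lambda Z, b=b: np.sin(Z[:, 0]) + b * Z[:, 1], X
    )
    for b in slopes
])
lower, upper = np.percentile(psi_draws, [2.5, 97.5], axis=0)
```

In this toy setting the estimate for the ignored input `x2` is exactly zero, while the intervals for `x0` and `x1` sit away from zero, which is the kind of signal a selection rule based on $\Psi_j$ would use.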
