MLLGApr 15, 2022

Towards a Unified Framework for Uncertainty-aware Nonlinear Variable Selection with Theoretical Guarantees

arXiv:2204.07293v24 citationsh-index: 94
AI Analysis

This work addresses the challenge of selecting important variables in nonlinear models with theoretical guarantees, which is crucial for interpretability in domains like healthcare, though it is incremental as it builds on existing variable selection techniques.

The authors tackled the problem of nonlinear variable selection by developing a unified framework that incorporates uncertainty quantification and is compatible with various machine learning models, achieving superior performance over existing methods in simulations and healthcare datasets.

We develop a simple and unified framework for nonlinear variable selection that incorporates uncertainty in the prediction function and is compatible with a wide range of machine learning models (e.g., tree ensembles, kernel methods, neural networks, etc). In particular, for a learned nonlinear model $f(\mathbf{x})$, we consider quantifying the importance of an input variable $\mathbf{x}^j$ using the integrated partial derivative $Ψ_j = \Vert \frac{\partial}{\partial \mathbf{x}^j} f(\mathbf{x})\Vert^2_{P_\mathcal{X}}$. We then (1) provide a principled approach for quantifying variable selection uncertainty by deriving its posterior distribution, and (2) show that the approach is generalizable even to non-differentiable models such as tree ensembles. Rigorous Bayesian nonparametric theorems are derived to guarantee the posterior consistency and asymptotic uncertainty of the proposed approach. Extensive simulations and experiments on healthcare benchmark datasets confirm that the proposed algorithm outperforms existing classic and recent variable selection methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes