ML | LG | Nov 8, 2021

Fast and Scalable Spike and Slab Variable Selection in High-Dimensional Gaussian Processes

arXiv:2111.04558v2 | 12 citations
AI Analysis

This work addresses the computational bottleneck in variable selection for high-dimensional Gaussian processes, offering an incremental improvement for practitioners in machine learning and statistics.

The paper tackles slow and costly variable selection in high-dimensional Gaussian processes by developing a fast and scalable variational inference algorithm with spike and slab priors. The method achieves runtimes similar to sparse variational GPs even when $n=10^6$, and runs up to 1000 times faster than MCMC-based methods while performing competitively.

Variable selection in Gaussian processes (GPs) is typically undertaken by thresholding the inverse lengthscales of automatic relevance determination kernels, but in high-dimensional datasets this approach can be unreliable. A more probabilistically principled alternative is to use spike and slab priors and infer a posterior probability of variable inclusion. However, existing implementations in GPs are very costly to run in both high-dimensional and large-$n$ datasets, or are only suitable for unsupervised settings with specific kernels. As such, we develop a fast and scalable variational inference algorithm for the spike and slab GP that is tractable with arbitrary differentiable kernels. We improve our algorithm's ability to adapt to the sparsity of relevant variables by Bayesian model averaging over hyperparameters, and achieve substantial speed ups using zero temperature posterior restrictions, dropout pruning and nearest neighbour minibatching. In experiments our method consistently outperforms vanilla and sparse variational GPs whilst retaining similar runtimes (even when $n=10^6$) and performs competitively with a spike and slab GP using MCMC but runs up to $1000$ times faster.
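To make the two selection strategies in the abstract concrete, here is a minimal sketch in plain Python. It contrasts thresholding the inverse lengthscales of an ARD kernel with a spike and slab parameterisation, in which a binary indicator switches each inverse lengthscale on or off. All function names, the squared-exponential kernel choice, and the 0.5 probability cutoff are illustrative assumptions; this is not the paper's inference algorithm, only the kernel-level idea it builds on.

```python
import math


def ard_rbf(x, y, inv_lengthscales, variance=1.0):
    """ARD squared-exponential kernel between two points (assumed kernel form).

    k(x, y) = variance * exp(-0.5 * sum_j (theta_j * (x_j - y_j))**2)
    """
    sq_dist = sum((t * (a - b)) ** 2 for t, a, b in zip(inv_lengthscales, x, y))
    return variance * math.exp(-0.5 * sq_dist)


def threshold_selection(inv_lengthscales, threshold):
    """Baseline: keep variable j if |theta_j| exceeds a hand-picked threshold."""
    return [abs(t) > threshold for t in inv_lengthscales]


def spike_slab_inv_lengthscales(slab_values, inclusion):
    """Spike and slab view: theta_j = gamma_j * slab_j with gamma_j in {0, 1}.

    gamma_j = 0 (the spike) sets theta_j to exactly zero, so variable j
    no longer affects the kernel.
    """
    return [t * g for t, g in zip(slab_values, inclusion)]


def select_by_inclusion_probability(probs, cutoff=0.5):
    """Select variables whose posterior inclusion probability exceeds a cutoff.

    The 0.5 default is a hypothetical choice for illustration.
    """
    return [p > cutoff for p in probs]


# A variable excluded by the spike drops out of the kernel entirely:
x = [0.2, -1.0, 3.0]
y = [1.5, 4.0, 2.5]
theta = spike_slab_inv_lengthscales([1.0, 0.7, 2.0], [1, 0, 1])
k_full = ard_rbf(x, y, theta)
k_reduced = ard_rbf([x[0], x[2]], [y[0], y[2]], [1.0, 2.0])
```

Here `k_full` and `k_reduced` agree, because the second variable has an inverse lengthscale of exactly zero. Thresholding instead relies on a small but non-zero estimate falling below a user-chosen cutoff, which is the step the abstract describes as unreliable in high dimensions.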

Code Implementations: 1 repo
