ST LG CO ME ML
May 29, 2015

On the Computational Complexity of High-Dimensional Bayesian Variable Selection

Yun Yang, Martin J. Wainwright, Michael I. Jordan

arXiv:1505.07925v1

Originality: Incremental advance

AI Analysis

The paper addresses computational bottlenecks faced by statisticians and machine learning practitioners working with high-dimensional data, offering incremental improvements in mixing-time guarantees.

The paper studies the computational complexity of MCMC methods for high-dimensional Bayesian variable selection and shows that statistical consistency does not guarantee rapid mixing. It then gives conditions under which a truncated sparsity prior achieves both variable-selection consistency and a mixing time that is linear in the number of covariates, up to a logarithmic factor.

We study the computational complexity of Markov chain Monte Carlo (MCMC) methods for high-dimensional Bayesian linear regression under sparsity constraints. We first show that a Bayesian approach can achieve variable-selection consistency under relatively mild conditions on the design matrix. We then demonstrate that the statistical criterion of posterior concentration need not imply the computational desideratum of rapid mixing of the MCMC algorithm. By introducing a truncated sparsity prior for variable selection, we provide a set of conditions that guarantee both variable-selection consistency and rapid mixing of a particular Metropolis-Hastings algorithm. The mixing time is linear in the number of covariates up to a logarithmic factor. Our proof controls the spectral gap of the Markov chain by constructing a canonical path ensemble that is inspired by the steps taken by greedy algorithms for variable selection.
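To make the setting concrete, here is a minimal Python sketch of a Metropolis-Hastings sampler over variable-inclusion vectors with a truncated sparsity prior. The abstract does not specify the prior, the proposal, or any hyperparameters, so everything below is an assumption made for illustration: a Zellner-style g-prior marginal likelihood, a model-size penalty of p^(-kappa * k), a hard cap `s_max` on model size, and a proposal that mixes single flips with swaps. The names `log_posterior`, `mh_step`, `run_chain`, `g`, `kappa`, and `s_max` are hypothetical and are not taken from the paper.

```python
import numpy as np


def log_posterior(gamma, X, y, g, kappa, s_max):
    """Unnormalized log posterior of a boolean inclusion vector `gamma`.

    Assumed form (illustrative, not taken from the abstract):
    p^(-kappa*k) * (1+g)^(-k/2) * (1 + g*(1 - R2))^(-n/2), restricted to k <= s_max.
    """
    n, p = X.shape
    k = int(gamma.sum())
    if k > s_max:
        # Truncation: models larger than s_max receive zero prior mass.
        return -np.inf
    if k == 0:
        r2 = 0.0
    else:
        Xg = X[:, gamma]
        beta, *_ = np.linalg.lstsq(Xg, y, rcond=None)
        fitted = Xg @ beta
        # Uncentered R^2: squared norm of the projection of y over squared norm of y.
        r2 = min(float(fitted @ fitted) / float(y @ y), 1.0)
    return (-kappa * k * np.log(p)
            - 0.5 * k * np.log1p(g)
            - 0.5 * n * np.log1p(g * (1.0 - r2)))


def mh_step(gamma, logp, X, y, g, kappa, s_max, rng):
    """One Metropolis-Hastings step with a symmetric flip-or-swap proposal."""
    p = gamma.size
    proposal = gamma.copy()
    included = np.flatnonzero(gamma)
    excluded = np.flatnonzero(~gamma)
    if rng.random() < 0.5:
        # Single flip: add or remove one uniformly chosen covariate.
        j = rng.integers(p)
        proposal[j] = not proposal[j]
    elif included.size > 0 and excluded.size > 0:
        # Swap: exchange one included covariate for one excluded covariate.
        proposal[rng.choice(included)] = False
        proposal[rng.choice(excluded)] = True
    else:
        # No swap is possible; the chain stays where it is.
        return gamma, logp
    new_logp = log_posterior(proposal, X, y, g, kappa, s_max)
    # The proposal is symmetric, so the acceptance ratio is the posterior ratio.
    if np.log(rng.random()) < new_logp - logp:
        return proposal, new_logp
    return gamma, logp


def run_chain(X, y, g, kappa, s_max, n_steps, seed=0):
    """Run the chain from the empty model and return the visited states."""
    rng = np.random.default_rng(seed)
    gamma = np.zeros(X.shape[1], dtype=bool)
    logp = log_posterior(gamma, X, y, g, kappa, s_max)
    samples = []
    for _ in range(n_steps):
        gamma, logp = mh_step(gamma, logp, X, y, g, kappa, s_max, rng)
        samples.append(gamma.copy())
    return samples
```

Because over-sized models have log posterior of negative infinity, any proposal that exceeds `s_max` is always rejected, so the chain never leaves the truncated model space. The sketch says nothing about mixing time; it only shows the kind of state space and local moves that the abstract's analysis concerns.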
