ML, LG | Mar 4, 2013

Bayesian Compressed Regression

arXiv:1303.0642v2 | 78 citations

AI Analysis

This paper addresses the computational and storage challenges of high-dimensional regression, offering statisticians and data scientists a novel alternative to variable selection or shrinkage.

The authors tackle high-dimensional regression by randomly compressing the predictors before analysis, which reduces storage and computational bottlenecks. They show near-parametric convergence rates for the predictive density and report computation that is many orders of magnitude faster than existing Bayesian dimensionality reduction methods.

As an alternative to variable selection or shrinkage in high dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors can be projected to a low dimensional linear subspace with minimal loss of information about the response. As opposed to existing Bayesian dimensionality reduction approaches, the exact posterior distribution conditional on the compressed data is available analytically, speeding up computation by many orders of magnitude while also bypassing robustness issues due to convergence and mixing problems with MCMC. Model averaging is used to reduce sensitivity to the random projection matrix, while accommodating uncertainty in the subspace dimension. Strong theoretical support is provided for the approach by showing near parametric convergence rates for the predictive density in the large p small n asymptotic paradigm. Practical performance relative to competitors is illustrated in simulations and real data applications.
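The pipeline described in the abstract (compress the predictors, compute a closed-form posterior on the compressed data, then average over several random projections) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian projection matrix, the known noise variance `sigma2`, the `N(0, tau2 I)` prior on the compressed coefficients, and the helper names `compressed_posterior` and `bcr_predict`. The paper's actual prior and projection construction may differ.

```python
import numpy as np


def compressed_posterior(X, y, m, rng, sigma2=1.0, tau2=1.0):
    """Closed-form posterior for regression on randomly compressed predictors.

    Assumed model (illustrative only): y | Z, beta ~ N(Z beta, sigma2 I),
    beta ~ N(0, tau2 I), where Z = X Phi^T and Phi is an m x p random matrix.
    """
    n, p = X.shape
    # Assumed projection: i.i.d. Gaussian entries, scaled by 1/sqrt(p).
    Phi = rng.standard_normal((m, p)) / np.sqrt(p)
    Z = X @ Phi.T  # compressed design matrix, n x m

    # Conjugate Gaussian posterior for the m compressed coefficients.
    precision = Z.T @ Z / sigma2 + np.eye(m) / tau2
    cov = np.linalg.inv(precision)
    mean = cov @ Z.T @ y / sigma2

    # Log marginal likelihood: y ~ N(0, sigma2 I + tau2 Z Z^T).
    S = sigma2 * np.eye(n) + tau2 * Z @ Z.T
    _, logdet = np.linalg.slogdet(S)
    log_ml = -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(S, y))
    return Phi, mean, cov, log_ml


def bcr_predict(X, y, X_new, dims, rng, sigma2=1.0, tau2=1.0):
    """Average predictions over random projections of the given dimensions.

    Weights are proportional to each model's marginal likelihood, which is
    model averaging under a uniform prior over the candidate projections.
    """
    fits = [compressed_posterior(X, y, m, rng, sigma2, tau2) for m in dims]
    log_mls = np.array([fit[3] for fit in fits])
    weights = np.exp(log_mls - log_mls.max())  # subtract max for stability
    weights /= weights.sum()
    # Posterior mean prediction of each model at the new points.
    preds = np.stack([X_new @ Phi.T @ mean for Phi, mean, _, _ in fits])
    return weights @ preds, weights
```

Because each posterior is available in closed form, no MCMC is needed; the cost per projection is dominated by linear algebra on `m x m` and `n x n` matrices rather than anything of size `p x p`. Passing several values in `dims` is one simple way to reflect uncertainty about the subspace dimension.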
