ML, Jun 7, 2014

Compressed Gaussian Process

arXiv:1406.1916v1, 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses scalable regression for big data in machine learning, offering an incremental improvement over partitioning methods.

The paper tackles nonparametric regression on massive datasets by combining random compression with Gaussian process regression, which avoids partitioning the feature space. The authors report state-of-the-art predictive performance and fast computation.

Nonparametric regression for massive numbers of samples (n) and features (p) is an increasingly important problem. In big n settings, a common strategy is to partition the feature space, and then separately apply simple models to each partition set. We propose an alternative approach, which avoids such partitioning and the associated sensitivity to neighborhood choice and distance metrics, by using random compression combined with Gaussian process regression. The proposed approach is particularly motivated by the setting in which the response is conditionally independent of the features given the projection to a low dimensional manifold. Conditionally on the random compression matrix and a smoothness parameter, the posterior distribution for the regression surface and posterior predictive distributions are available analytically. Running the analysis in parallel for many random compression matrices and smoothness parameters, model averaging is used to combine the results. The algorithm can be implemented rapidly even in very big n and p problems, has strong theoretical justification, and is found to yield state of the art predictive performance.
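The pipeline described in the abstract (compress the features at random, fit a Gaussian process with a closed-form posterior, then average over many compression matrices and smoothness parameters) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several choices here are assumptions made for illustration: the Gaussian entries of the compression matrix and their 1/sqrt(p) scaling, the squared-exponential kernel, the fixed noise variance, the grid of length-scales standing in for the smoothness parameter, and weighting the fits by their marginal likelihood. All function and parameter names are hypothetical.

```python
import numpy as np

def sq_exp_kernel(A, B, lengthscale):
    # Squared-exponential kernel between the rows of A and the rows of B.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale**2)

def gp_fit_predict(Z, y, Z_new, lengthscale, noise_var):
    # Closed-form GP posterior mean and log marginal likelihood,
    # conditional on the compressed inputs Z and one length-scale.
    n = Z.shape[0]
    K = sq_exp_kernel(Z, Z, lengthscale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = sq_exp_kernel(Z_new, Z, lengthscale) @ alpha
    log_ml = (-0.5 * y @ alpha
              - np.sum(np.log(np.diag(L)))
              - 0.5 * n * np.log(2.0 * np.pi))
    return mean, log_ml

def compressed_gp_predict(X, y, X_new, m=2, n_proj=20,
                          lengthscales=(0.5, 1.0, 2.0),
                          noise_var=0.01, seed=0):
    # Average GP predictions over random compressions and length-scales.
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    means, log_mls = [], []
    for _ in range(n_proj):
        # Assumed form of the m x p compression matrix: i.i.d. Gaussian entries.
        Phi = rng.standard_normal((m, p)) / np.sqrt(p)
        Z, Z_new = X @ Phi.T, X_new @ Phi.T
        # Each (Phi, lengthscale) fit is independent, so this loop could run in parallel.
        for ell in lengthscales:
            mean, log_ml = gp_fit_predict(Z, y, Z_new, ell, noise_var)
            means.append(mean)
            log_mls.append(log_ml)
    log_mls = np.array(log_mls)
    # Model-averaging weights proportional to the marginal likelihood
    # (uniform prior over the candidate models is an assumption).
    w = np.exp(log_mls - log_mls.max())
    w /= w.sum()
    return w @ np.array(means), w
```

Calling `compressed_gp_predict(X, y, X_new)` with an `n x p` feature array returns the averaged predictive mean at the rows of `X_new` together with the model weights. Each GP only ever sees `m`-dimensional inputs, so kernel evaluations no longer scale with `p`. The sketch compresses features only, so the O(n^3) Cholesky cost in `n` remains.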
