MLLG, Feb 6, 2014

Distributed Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models

arXiv:1402.1389v2, 153 citations
AI Analysis

This work addresses the scalability problem faced by researchers and practitioners who apply Gaussian processes to big data, offering an incremental improvement through distributed variational inference.

The authors address the scalability of Gaussian processes (GPs) to big datasets by introducing a novel re-parametrisation of variational inference for sparse GP regression and latent variable models. The re-parametrisation enables an efficient distributed algorithm that scales well with data and computational resources. Experiments on flight data with 2 million records and on MNIST show that GP performance improves with more data and that GPs perform better than many common models often used for big data.

Gaussian processes (GPs) are a powerful tool for probabilistic inference over functions. They have been applied to both regression and non-linear dimensionality reduction, and offer desirable properties such as uncertainty estimates, robustness to over-fitting, and principled ways for tuning hyper-parameters. However, the scalability of these models to big datasets remains an active topic of research. We introduce a novel re-parametrisation of variational inference for sparse GP regression and latent variable models that allows for an efficient distributed algorithm. This is done by exploiting the decoupling of the data given the inducing points to re-formulate the evidence lower bound in a Map-Reduce setting. We show that the inference scales well with data and computational resources, while preserving a balanced distribution of the load among the nodes. We further demonstrate the utility in scaling Gaussian processes to big data. We show that GP performance improves with increasing amounts of data in regression (on flight data with 2 million records) and latent variable modelling (on MNIST). The results show that GPs perform better than many common models often used for big data.
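The Map-Reduce idea in the abstract can be illustrated with a small sketch for the regression case. It is not the authors' implementation. It assumes the standard collapsed variational bound for sparse GP regression, an RBF kernel with fixed hyper-parameters, one-dimensional outputs, and Python's built-in `map` and `reduce` standing in for a real cluster. All function names (`rbf`, `map_stats`, `add_stats`, `collapsed_bound`) are hypothetical. The point it shows is that, given the inducing inputs `Z`, each shard contributes only a few fixed-size sums, so the bound computed from the reduced statistics matches the bound computed from all the data at once.

```python
import numpy as np
from functools import reduce

# Assumed fixed RBF hyper-parameters (illustrative values only).
LENGTHSCALE, VARIANCE = 1.0, 1.0

def rbf(A, B):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return VARIANCE * np.exp(-0.5 * np.maximum(d2, 0.0) / LENGTHSCALE**2)

def map_stats(shard, Z):
    """Map step: per-shard partial sums that depend on the data only via Z."""
    X, y = shard
    Kmn = rbf(Z, X)                       # M x n_shard
    return (X.shape[0],                   # number of points in the shard
            float(y @ y),                 # sum of squared targets
            VARIANCE * X.shape[0],        # sum of k(x_i, x_i)
            Kmn @ Kmn.T,                  # M x M statistic
            Kmn @ y)                      # length-M statistic

def add_stats(s, t):
    """Reduce step: partial statistics combine by plain addition."""
    return tuple(a + b for a, b in zip(s, t))

def collapsed_bound(stats, Z, noise_var, jitter=1e-6):
    """Global step: evaluate the collapsed bound from the summed statistics."""
    n, yy, psi0, Phi, c = stats
    Kmm = rbf(Z, Z) + jitter * np.eye(Z.shape[0])
    Sigma = Kmm + Phi / noise_var
    _, logdet_Kmm = np.linalg.slogdet(Kmm)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    quad = yy / noise_var - c @ np.linalg.solve(Sigma, c) / noise_var**2
    trace = psi0 - np.trace(np.linalg.solve(Kmm, Phi))
    return (-0.5 * n * np.log(2.0 * np.pi * noise_var)
            - 0.5 * (logdet_Sigma - logdet_Kmm)
            - 0.5 * quad
            - 0.5 * trace / noise_var)

# Toy data, split into four shards as if held on four nodes.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(200)
Z = np.linspace(-2.0, 2.0, 5)[:, None]    # inducing inputs
noise_var = 0.1

shards = list(zip(np.array_split(X, 4), np.array_split(y, 4)))
stats = reduce(add_stats, map(lambda s: map_stats(s, Z), shards))
bound = collapsed_bound(stats, Z, noise_var)
```

Each node returns statistics whose size depends only on the number of inducing points, not on the shard size, which is what keeps communication cheap in this sketch. Optimising the hyper-parameters and inducing inputs, and the latent variable model case, are left out.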
