ML CO ME

Apr 5, 2016

Fast methods for training Gaussian processes on large data sets

Christopher J. Moore, Alvin J. K. Chua, Christopher P. L. Berry, Jonathan R. Gair

arXiv:1604.01250v2

44 citations

Originality: Incremental advance

AI Analysis

This work addresses a computational bottleneck for researchers and practitioners who use Gaussian process regression on large data sets. It appears incremental because it builds on existing methods.

The paper tackles the computational cost of training Gaussian processes on large data sets. It derives simple results that speed up the learning stage and Bayesian model comparison between covariance functions, and it quantifies the speed-up relative to nested sampling on both synthetic and real data.

Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large data sets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.
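
To make the matrix cost mentioned in the abstract concrete, the following is a minimal sketch of the textbook GP log marginal likelihood, evaluated for two candidate covariance functions on toy data. It is not the set of results derived in the paper. The kernel choices, the fixed hyperparameter values, the noise level, the toy data, and all function names are assumptions made for illustration only.

```python
import math

def sq_exp(x1, x2, amp=1.0, length=1.0):
    """Squared-exponential covariance k(x1, x2) (illustrative choice)."""
    return amp**2 * math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def exponential(x1, x2, amp=1.0, length=1.0):
    """Exponential (Ornstein-Uhlenbeck) covariance k(x1, x2) (illustrative choice)."""
    return amp**2 * math.exp(-abs(x1 - x2) / length)

def cholesky(a):
    """Lower-triangular L with L L^T = a. This O(n^3) step dominates the cost."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def log_marginal_likelihood(xs, ys, kernel, noise=0.1, **hyper):
    """ln p(y | x, hyperparameters) for a zero-mean GP with white noise."""
    n = len(xs)
    # Covariance matrix of the data, with noise variance added on the diagonal.
    cov = [[kernel(xs[i], xs[j], **hyper) + (noise**2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    # Forward substitution: solve L z = y, so that y^T K^{-1} y = z^T z.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][m] * z[m] for m in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)

# Toy data (assumed): a noiseless sine curve sampled at 30 points.
xs = [0.1 * i for i in range(30)]
ys = [math.sin(x) for x in xs]

lml_se = log_marginal_likelihood(xs, ys, sq_exp, length=1.0)
lml_exp = log_marginal_likelihood(xs, ys, exponential, length=1.0)
print("squared-exponential:", lml_se)
print("exponential:", lml_exp)
```

Each evaluation requires one Cholesky factorisation of an n × n matrix, which is why repeated evaluation becomes expensive as n grows. The sketch compares the two covariance functions only at fixed hyperparameters; a full Bayesian model comparison would integrate over the hyperparameters to obtain the evidence, for example with nested sampling as the abstract mentions. In practice one would typically replace the hand-written factorisation with `numpy.linalg.cholesky` or `scipy.linalg.cho_factor`.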
