ML | LG | Oct 30, 2020

Marginalised Gaussian Processes with Nested Sampling

arXiv:2010.16344v2 | 12 citations
Originality: Incremental advance
AI Analysis

This work addresses uncertainty quantification and overfitting in Gaussian Process models, which matters to machine learning researchers and practitioners. It is incremental in that it applies an existing technique, Nested Sampling, to the marginalisation of kernel hyperparameters.

The authors address underestimated predictive uncertainty and overfitting in Gaussian Process models by marginalising kernel hyperparameters with Nested Sampling. They report substantial gains in predictive performance on synthetic and benchmark data sets, and a speed advantage over Hamiltonian Monte Carlo.

Gaussian Process (GP) models are a rich distribution over functions, with inductive biases controlled by a kernel function. Learning occurs through the optimisation of kernel hyperparameters using the marginal likelihood as the objective. This classical approach, known as Type-II maximum likelihood (ML-II), yields point estimates of the hyperparameters and continues to be the default method for training GPs. However, it risks underestimating predictive uncertainty and is prone to overfitting, especially when there are many hyperparameters. Furthermore, gradient-based optimisation makes ML-II point estimates highly susceptible to the presence of local minima. This work presents an alternative learning procedure in which the hyperparameters of the kernel function are marginalised using Nested Sampling (NS), a technique that is well suited to sampling from complex, multi-modal distributions. We focus on regression tasks with the spectral mixture (SM) class of kernels and find that a principled approach to quantifying model uncertainty leads to substantial gains in predictive performance across a range of synthetic and benchmark data sets. In this context, nested sampling is also found to offer a speed advantage over Hamiltonian Monte Carlo (HMC), widely considered to be the gold standard in MCMC-based inference.
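The difference between an ML-II point estimate and a marginalised predictive can be sketched in a few lines of NumPy. This is an illustration only, not the authors' implementation: it assumes a one-dimensional spectral mixture kernel in its commonly cited form, and the hyperparameter samples and their weights are made-up placeholders standing in for the weighted posterior samples a nested sampler would return. All function and variable names are hypothetical.

```python
import numpy as np

def sm_kernel(x1, x2, weights, means, variances):
    """1-D spectral mixture kernel:
    k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q)."""
    tau = x1[:, None] - x2[None, :]
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return k

def log_marginal_likelihood(x, y, theta, noise_var):
    """log p(y | x, theta): the objective that ML-II maximises over theta."""
    K = sm_kernel(x, x, *theta) + noise_var * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(x) * np.log(2.0 * np.pi)

def predict(x, y, x_star, theta, noise_var):
    """GP predictive mean and variance (including noise) for one fixed theta."""
    K = sm_kernel(x, x, *theta) + noise_var * np.eye(len(x))
    Ks = sm_kernel(x, x_star, *theta)
    Kss_diag = np.diag(sm_kernel(x_star, x_star, *theta))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    V = np.linalg.solve(L, Ks)
    return Ks.T @ alpha, Kss_diag - (V**2).sum(axis=0) + noise_var

def marginalised_predict(x, y, x_star, thetas, sample_weights, noise_var):
    """Moment-match the mixture of per-sample GP predictives."""
    w = np.asarray(sample_weights, dtype=float)
    w = w / w.sum()
    preds = [predict(x, y, x_star, th, noise_var) for th in thetas]
    means = np.array([m for m, _ in preds])
    variances = np.array([v for _, v in preds])
    mix_mean = w @ means
    # Law of total variance: average variance plus the spread of the means.
    mix_var = w @ (variances + means**2) - mix_mean**2
    return mix_mean, mix_var

# Toy data (assumed for illustration): a noisy sinusoid.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 30)
y = np.sin(2.0 * np.pi * 0.5 * x) + 0.1 * rng.standard_normal(x.size)
x_star = np.linspace(0.0, 6.0, 50)
noise_var = 0.05

# Hypothetical hyperparameter samples (weights, means, variances) and
# hypothetical posterior weights; a real run would obtain these from NS or HMC.
thetas = [([1.0], [0.50], [0.05]), ([1.0], [0.45], [0.05]), ([1.2], [0.55], [0.10])]
sample_weights = [0.5, 0.3, 0.2]

# Plug-in stand-in for ML-II: keep only the sample with the highest marginal likelihood.
lmls = [log_marginal_likelihood(x, y, th, noise_var) for th in thetas]
ml2_mean, ml2_var = predict(x, y, x_star, thetas[int(np.argmax(lmls))], noise_var)

# Marginalised predictive: combine all samples instead of committing to one.
mix_mean, mix_var = marginalised_predict(x, y, x_star, thetas, sample_weights, noise_var)
```

The mixture variance is the weighted average of the per-sample variances plus the variance of the per-sample means, so it can never fall below that average. The second term is the hyperparameter uncertainty that a single point estimate discards.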

Code Implementations: 1 repo
