Sparse Gaussian Process Hyperparameters: Optimize or Integrate?
This work addresses hyperparameter uncertainty in sparse Gaussian processes, a practical concern for machine learning practitioners, but the contribution is incremental because it builds on prior variational frameworks.
The authors tackle hyperparameter uncertainty in sparse Gaussian processes, which can cause biased estimates and underestimated predictive uncertainty. They propose an MCMC-based algorithm that samples hyperparameters within a variational framework, improving sampling efficiency compared to existing methods.
The kernel function and its hyperparameters are the central model selection choice in a Gaussian process (Rasmussen and Williams, 2006). Typically, the hyperparameters of the kernel are chosen by maximising the marginal likelihood, an approach known as Type-II maximum likelihood (ML-II). However, ML-II does not account for hyperparameter uncertainty, and it is well known that this can lead to severely biased estimates and an underestimation of predictive uncertainty. While several works employ a fully Bayesian characterisation of GPs, relatively few propose such approaches for the sparse GP paradigm. In this work we propose an algorithm for sparse Gaussian process regression which leverages MCMC to sample from the hyperparameter posterior within the variational inducing point framework of Titsias (2009). This work is closely related to Hensman et al. (2015b) but side-steps the need to sample the inducing points, thereby significantly improving sampling efficiency in the Gaussian likelihood case. We compare this scheme against natural baselines from the literature and against stochastic variational GPs (SVGPs), and provide an extensive computational analysis.
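
To make the idea concrete, the following is a minimal sketch of sampling hyperparameters against the collapsed bound of Titsias (2009), in which the inducing variables are integrated out analytically under a Gaussian likelihood. Several choices here are illustrative assumptions rather than the authors' setup: the squared-exponential kernel, the standard-normal prior on log-hyperparameters, the fixed inducing inputs z, and the random-walk Metropolis sampler, which stands in for whatever MCMC scheme the paper actually uses. The function names are hypothetical.

```python
import numpy as np

def rbf(a, b, lengthscale, variance):
    """Squared-exponential kernel between rows of a and b (assumed kernel)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def collapsed_elbo(log_theta, x, y, z):
    """Titsias (2009) collapsed bound:
    log N(y | 0, Qnn + s2*I) - tr(Knn - Qnn) / (2*s2),
    with Qnn = Knm Kmm^{-1} Kmn and s2 the noise variance."""
    ls, var, noise = np.exp(log_theta)
    n, m = len(x), len(z)
    kmm = rbf(z, z, ls, var) + 1e-6 * var * np.eye(m)  # jitter for stability
    kmn = rbf(z, x, ls, var)
    L = np.linalg.cholesky(kmm)
    A = np.linalg.solve(L, kmn) / np.sqrt(noise)       # m x n
    B = np.eye(m) + A @ A.T
    LB = np.linalg.cholesky(B)
    c = np.linalg.solve(LB, A @ y) / np.sqrt(noise)
    log_det = 2.0 * np.sum(np.log(np.diag(LB))) + n * np.log(noise)
    quad = y @ y / noise - c @ c                       # Woodbury identity
    # tr(Knn) = n * var holds for the RBF kernel, whose diagonal is var.
    trace = (n * var - noise * np.sum(A ** 2)) / noise
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + quad) - 0.5 * trace

def log_target(log_theta, x, y, z):
    """Bound plus an assumed N(0, 1) log-prior on each log-hyperparameter."""
    try:
        bound = collapsed_elbo(log_theta, x, y, z)
    except np.linalg.LinAlgError:
        return -np.inf                                 # reject unstable proposals
    return bound - 0.5 * np.sum(log_theta ** 2)

def sample_hyperparameters(x, y, z, n_samples=500, step=0.1, seed=0):
    """Random-walk Metropolis over log(lengthscale, variance, noise)."""
    rng = np.random.default_rng(seed)
    current = np.zeros(3)
    current_lp = log_target(current, x, y, z)
    samples = np.empty((n_samples, 3))
    for i in range(n_samples):
        proposal = current + step * rng.standard_normal(3)
        proposal_lp = log_target(proposal, x, y, z)
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples[i] = current
    return samples

# Toy usage on synthetic 1-D data with 8 fixed inducing inputs.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=(50, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(50)
z = np.linspace(-3.0, 3.0, 8)[:, None]
samples = sample_hyperparameters(x, y, z, n_samples=300)
print(np.exp(samples[100:]).mean(axis=0))  # posterior-mean hyperparameters
```

Because the inducing variables never appear in the sampler state, the chain explores only three dimensions here, which is the source of the efficiency gain the abstract describes for the Gaussian likelihood case. ML-II would instead return the single maximiser of the same bound.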