Parametric Gaussian Process Regressors
This work addresses poor uncertainty estimates in scalable GP regression, a problem that matters to practitioners who rely on accurate probabilistic predictions. The contribution is incremental, since it builds on existing inducing point and variational methods.
The authors tackle underestimated predictive uncertainties in scalable Gaussian Process regression based on inducing points and stochastic variational inference. They propose two methods that significantly improve uncertainty calibration and raise log likelihoods, often by as much as half a nat per datapoint.
The combination of inducing point methods with stochastic variational inference has enabled approximate Gaussian Process (GP) inference on large datasets. Unfortunately, the resulting predictive distributions often exhibit substantially underestimated uncertainties. Notably, in the regression case the predictive variance is typically dominated by observation noise, yielding uncertainty estimates that make little use of the input-dependent function uncertainty that makes GP priors attractive. In this work we propose two simple methods for scalable GP regression that address this issue and thus yield substantially improved predictive uncertainties. The first applies variational inference to FITC (Fully Independent Training Conditional; Snelson et al., 2006). The second bypasses posterior approximations and instead directly targets the posterior predictive distribution. In an extensive empirical comparison with a number of alternative methods for scalable GP regression, we find that the resulting predictive distributions exhibit significantly better calibrated uncertainties and higher log likelihoods, often by as much as half a nat per datapoint.
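To make the distinction between observation noise and input-dependent function uncertainty concrete, the following is a minimal sketch using exact GP regression on a toy 1-D problem. It is not an implementation of either proposed method. The RBF kernel, the hyperparameter values, and the helper names `rbf_kernel` and `predictive_variance_parts` are assumptions chosen purely for illustration. The sketch splits the predictive variance of an observation into the latent function variance, which depends on the test input, and the constant observation noise variance.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def predictive_variance_parts(x_train, x_test, noise_var=0.1):
    """Return (function_variance, noise_variance) at each test input.

    Exact GP regression with a unit-variance RBF kernel (illustrative only).
    """
    # Training covariance including observation noise: K + sigma^2 I
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    prior_var = np.ones(len(x_test))  # k(x, x) = 1 for the unit-variance kernel
    # Latent function variance: k(x*, x*) - k*^T (K + sigma^2 I)^{-1} k*
    L = np.linalg.cholesky(K)
    v = np.linalg.solve(L, K_s)
    func_var = prior_var - np.sum(v**2, axis=0)
    return func_var, np.full(len(x_test), noise_var)

# Twenty training inputs on [-1, 1]; one test point inside the data, one far away.
x_train = np.linspace(-1.0, 1.0, 20)
x_test = np.array([0.0, 5.0])
func_var, noise_var = predictive_variance_parts(x_train, x_test)

# Share of the total predictive variance attributable to observation noise.
noise_share = noise_var / (func_var + noise_var)
print("function variance:", func_var)
print("noise share:", noise_share)
```

In this toy setting the noise term accounts for most of the predictive variance near the training data, while far from the data the function variance returns to the prior variance. A predictive distribution whose variance is dominated by the noise term everywhere would lose exactly this input dependence.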