Scalable Bayesian Transformed Gaussian Processes
This work makes Bayesian regression more practical for high-dimensional datasets by removing a computational bottleneck, although the contribution is an incremental improvement to an existing method.
The authors address the computational expense of Bayesian transformed Gaussian processes (BTG) with scalable methods based on doubly sparse quadrature, quantile bounds, and rank-one matrix algebra. These methods enable prediction and model selection at speeds comparable to maximum-likelihood estimation (MLE), and the authors report that BTG outperforms MLE-based models empirically.
The Bayesian transformed Gaussian process (BTG) model, proposed by Kedem and Oliveira, is a fully Bayesian counterpart to the warped Gaussian process (WGP) and marginalizes out a joint prior over input warping and kernel hyperparameters. This fully Bayesian treatment of hyperparameters often provides more accurate regression estimates and superior uncertainty propagation, but is prohibitively expensive. The BTG posterior predictive distribution, itself estimated through high-dimensional integration, must be inverted to make predictions. To make the Bayesian approach practical and comparable in speed to maximum-likelihood estimation (MLE), we propose principled and fast techniques for computing with BTG. Our framework uses doubly sparse quadrature rules, tight quantile bounds, and rank-one matrix algebra to enable both fast model prediction and model selection. These scalable methods allow us to regress over higher-dimensional datasets and apply BTG with layered transformations that greatly improve its expressiveness. We demonstrate that BTG achieves superior empirical performance over MLE-based models.
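To illustrate why inverting the posterior predictive distribution benefits from quantile bounds, the following is a minimal sketch and not the implementation from this work. It assumes the quadrature rule yields a predictive CDF that is a mixture of component CDFs with nonnegative weights summing to one, and it uses Gaussian components as a stand-in for the actual BTG components. Under those assumptions, the mixture quantile lies between the smallest and largest component quantiles, which gives a bracket for root-finding. The function names `mixture_cdf` and `mixture_quantile` are hypothetical.

```python
from statistics import NormalDist


def mixture_cdf(y, weights, means, sds):
    """CDF of a Gaussian mixture: sum_i w_i * Phi((y - mu_i) / s_i)."""
    return sum(w * NormalDist(m, s).cdf(y) for w, m, s in zip(weights, means, sds))


def mixture_quantile(p, weights, means, sds, tol=1e-10, max_iter=200):
    """Invert the mixture CDF at level p by bisection.

    Assumes nonnegative weights that sum to one. The bracket comes from the
    component quantiles: at the largest component quantile every component
    CDF is at least p, so the mixture CDF is too; the smallest component
    quantile gives the matching lower bound.
    """
    component_q = [NormalDist(m, s).inv_cdf(p) for m, s in zip(means, sds)]
    lo, hi = min(component_q), max(component_q)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, means, sds) < p:
            lo = mid  # quantile lies to the right of mid
        else:
            hi = mid  # quantile lies at or to the left of mid
    return 0.5 * (lo + hi)


# Hypothetical quadrature output: two nodes with equal weight.
weights = [0.5, 0.5]
means = [-1.0, 1.0]
sds = [1.0, 1.0]
median = mixture_quantile(0.5, weights, means, sds)
upper = mixture_quantile(0.9, weights, means, sds)
```

Bisection is used here only because it is simple and guaranteed to converge on a valid bracket; a tighter bracket reduces the number of CDF evaluations, and each evaluation is itself a weighted sum over all quadrature nodes. Quadrature rules with negative weights would break the bracketing argument above and need separate treatment.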