LG May 22, 2022
Fast Gaussian Process Posterior Mean Prediction via Local Cross Validation and Precomputation
Alec M. Dunton, Benjamin W. Priest, Amanda Muyskens
Gaussian processes (GPs) are Bayesian non-parametric models useful in a myriad of applications. Despite their popularity, the cost of GP predictions (quadratic storage and cubic complexity with respect to the number of training points) remains a hurdle in applying GPs to large data. We present a fast posterior mean prediction algorithm called FastMuyGPs to address this shortcoming. FastMuyGPs is based upon the MuyGPs hyperparameter estimation algorithm and utilizes a combination of leave-one-out cross-validation, batching, nearest-neighbors sparsification, and precomputation to provide scalable, fast GP prediction. We demonstrate several benchmarks wherein FastMuyGPs prediction attains accuracy superior to, and runtime competitive with or superior to, both deep neural networks and state-of-the-art scalable GP algorithms.
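The abstract does not spell out how precomputation and nearest-neighbors sparsification combine, so the following is only a minimal sketch of one plausible reading: solve a small local linear system once per training point, then answer each query with a single dot product against the coefficients of its closest training point. The function names, the squared-exponential kernel (standing in for whatever kernel the authors use), the brute-force neighbor search, and the `nugget` term are all assumptions made for illustration, not the FastMuyGPs implementation.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    # Squared-exponential kernel; an illustrative stand-in, not the paper's kernel.
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


def precompute_coefficients(x_train, y_train, k, length_scale, nugget=1e-6):
    # Offline step: for every training point, solve one k-by-k system over its
    # neighborhood (the point itself plus its k - 1 nearest training points).
    n = len(x_train)
    nbr_idx = np.empty((n, k), dtype=int)
    coeffs = np.empty((n, k))
    for i in range(n):
        # Brute-force search; a real system would use a nearest-neighbor index.
        sq_dists = ((x_train - x_train[i]) ** 2).sum(axis=1)
        nbrs = np.argsort(sq_dists)[:k]
        k_nn = rbf_kernel(x_train[nbrs], x_train[nbrs], length_scale)
        k_nn += nugget * np.eye(k)  # small diagonal term for numerical stability
        nbr_idx[i] = nbrs
        coeffs[i] = np.linalg.solve(k_nn, y_train[nbrs])
    return nbr_idx, coeffs


def fast_posterior_mean(x_test, x_train, nbr_idx, coeffs, length_scale):
    # Online step: no linear solves, only one cross-covariance row and a dot
    # product per query, reusing the neighborhood of the closest training point.
    preds = np.empty(len(x_test))
    for j, query in enumerate(x_test):
        sq_dists = ((x_train - query) ** 2).sum(axis=1)
        anchor = int(np.argmin(sq_dists))
        nbrs = nbr_idx[anchor]
        k_cross = rbf_kernel(query[None, :], x_train[nbrs], length_scale)[0]
        preds[j] = k_cross @ coeffs[anchor]
    return preds
```

Under these assumptions the cubic cost is confined to n independent k-by-k solves done ahead of time, and storage grows as n times k rather than n squared.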
LG Sep 22, 2022
Scalable Gaussian Process Hyperparameter Optimization via Coverage Regularization
Killian Wood, Alec M. Dunton, Amanda Muyskens et al.
Gaussian processes (GPs) are Bayesian non-parametric models popular in a variety of applications due to their accuracy and native uncertainty quantification (UQ). Tuning GP hyperparameters is critical to ensure the validity of prediction accuracy and uncertainty; uniquely estimating multiple hyperparameters in, e.g., the Matérn kernel can also be a significant challenge. Moreover, training GPs on large-scale datasets is a highly active area of research: traditional maximum-likelihood hyperparameter training requires quadratic memory to form the covariance matrix and has cubic training complexity. To address the scalable hyperparameter tuning problem, we present a novel algorithm that estimates the smoothness and length-scale parameters of the Matérn kernel in order to improve the robustness of the resulting prediction uncertainties. Using novel loss functions similar to those in conformal prediction algorithms, within the computational framework provided by the hyperparameter estimation algorithm MuyGPs, we achieve improved UQ over leave-one-out likelihood maximization while maintaining a high degree of scalability, as demonstrated in numerical experiments.
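The abstract does not define its coverage-based loss functions. To make the underlying idea concrete, the sketch below computes the empirical coverage of Gaussian predictive intervals and a simple penalty for missing a target coverage level. Both function names and the absolute-difference penalty are assumptions chosen for illustration; they are not the losses proposed in the paper.

```python
import numpy as np


def empirical_coverage(y_true, mean, variance, z=1.96):
    # Fraction of targets falling inside mean +/- z * sqrt(variance).
    # z = 1.96 corresponds to a nominal 95% interval under a Gaussian predictive.
    half_width = z * np.sqrt(variance)
    inside = np.abs(y_true - mean) <= half_width
    return float(inside.mean())


def coverage_loss(y_true, mean, variance, target=0.95, z=1.96):
    # Hypothetical penalty: distance between empirical and nominal coverage.
    return abs(empirical_coverage(y_true, mean, variance, z) - target)
```

A penalty of this kind could, in principle, be evaluated on held-out leave-one-out predictions and combined with an accuracy term, so that hyperparameters yielding intervals that are too narrow or too wide are disfavored.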
CO Apr 29, 2021
MuyGPs: Scalable Gaussian Process Hyperparameter Estimation Using Local Cross-Validation
Amanda Muyskens, Benjamin Priest, Imène Goumiri et al.
Gaussian processes (GPs) are non-linear probabilistic models popular in many applications. However, naïve GP realizations require quadratic memory to store the covariance matrix and cubic computation to perform inference or evaluate the likelihood function. These bottlenecks have driven much investment in the development of approximate GP alternatives that scale to the large data sizes common in modern data-driven applications. We present in this manuscript MuyGPs, a novel efficient GP hyperparameter estimation method. MuyGPs builds upon prior methods that take advantage of the nearest neighbors structure of the data, and uses leave-one-out cross-validation to optimize covariance (kernel) hyperparameters without realizing a possibly expensive likelihood. We describe our model and methods in detail, and compare our implementations against the state-of-the-art competitors in a benchmark spatial statistics problem. We show that our method outperforms all known competitors both in terms of time-to-solution and the root mean squared error of the predictions.
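As a rough illustration of optimizing a kernel hyperparameter by leave-one-out cross-validation over nearest-neighbor sets, without ever forming the full likelihood, the sketch below predicts each point in a random batch from its k nearest training neighbors and picks the length scale that minimizes the mean squared error. The squared-exponential kernel, the grid search, the brute-force neighbor search, and all function names are assumptions made for this example; the actual MuyGPs method and its optimizer may differ.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    # Squared-exponential kernel; an illustrative stand-in for the paper's kernel.
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


def nearest_neighbors(x, batch_idx, k):
    # Brute-force k nearest neighbors of each batch point, excluding the point
    # itself so that every prediction is genuinely leave-one-out.
    sq_dists = ((x[batch_idx, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    sq_dists[np.arange(len(batch_idx)), batch_idx] = np.inf
    return np.argsort(sq_dists, axis=1)[:, :k]


def loo_mse(x, y, batch_idx, nn_idx, length_scale, nugget=1e-6):
    # Mean squared error of local GP mean predictions on the held-out batch.
    errors = []
    for i, nbrs in zip(batch_idx, nn_idx):
        k_nn = rbf_kernel(x[nbrs], x[nbrs], length_scale)
        k_nn += nugget * np.eye(len(nbrs))  # stabilizes the k-by-k solve
        k_cross = rbf_kernel(x[i:i + 1], x[nbrs], length_scale)
        pred = k_cross @ np.linalg.solve(k_nn, y[nbrs])
        errors.append((pred[0] - y[i]) ** 2)
    return float(np.mean(errors))


def fit_length_scale(x, y, k=10, batch_size=50,
                     grid=(0.05, 0.1, 0.2, 0.5, 1.0, 2.0), seed=0):
    # Neighbor sets do not depend on the hyperparameter, so they are computed
    # once and reused for every candidate length scale.
    rng = np.random.default_rng(seed)
    size = min(batch_size, len(x))
    batch_idx = rng.choice(len(x), size=size, replace=False)
    nn_idx = nearest_neighbors(x, batch_idx, k)
    losses = [loo_mse(x, y, batch_idx, nn_idx, ls) for ls in grid]
    return grid[int(np.argmin(losses))], losses
```

Each loss evaluation here costs one k-by-k solve per batch point, so the work scales with the batch size and k rather than with the cube of the training set size.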