Parametric Gaussian Process Regression for Big Data
This work addresses uncertainty quantification on large datasets and is aimed at machine learning researchers and practitioners. It appears incremental, since it builds on existing Gaussian process methods.
The paper scales Gaussian processes to big data by introducing parametric Gaussian processes, which avoid the need for stochastic variational inference. It demonstrates the approach on simulated data and on an airline benchmark dataset with about 6 million records.
This work introduces the concept of parametric Gaussian processes (PGPs), which builds on the seemingly self-contradictory idea of making Gaussian processes parametric. Parametric Gaussian processes are, by construction, designed to operate in "big data" regimes where one is interested in quantifying the uncertainty associated with noisy data. The proposed methodology circumvents the well-established need for stochastic variational inference, a scalable algorithm for approximating posterior distributions. The effectiveness of the proposed approach is demonstrated using an illustrative example with simulated data and a benchmark dataset from the airline industry with approximately 6 million records.
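For context, the following is a minimal sketch of standard exact Gaussian process regression, not the parametric method proposed in this work. It is included only to illustrate what "quantifying the uncertainty associated with noisy data" means in practice and why the exact approach is hard to scale: the Cholesky factorization costs on the order of n^3 operations for n training points. The function names `rbf_kernel` and `gp_predict`, the squared-exponential kernel, the hyperparameter values, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel choice)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise_var=0.01):
    """Exact GP posterior mean and latent variance at x_test, with a zero prior mean."""
    n = x_train.shape[0]
    # Kernel matrix of the noisy observations.
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(n)
    # Cholesky factorization: the O(n^3) step that prevents scaling to millions of records.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_s = rbf_kernel(x_train, x_test)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v**2, axis=0)
    return mean, var

# Toy data (hypothetical): noisy samples of sin(x) on [0, 5].
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 50)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(50)

# One test point inside the data range and one far outside it.
x_test = np.array([2.5, 10.0])
mean, var = gp_predict(x_train, y_train, x_test)
```

In this sketch the predictive variance is small at the point surrounded by training data and returns to roughly the prior variance at the point far from any data, which is the kind of uncertainty estimate a scalable method would need to preserve.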