Understanding Probabilistic Sparse Gaussian Process Approximations
This work addresses the computational bottleneck of Gaussian Process inference on large datasets and offers useful guidance for practitioners, although its contribution is incremental because it compares existing methods.
The paper investigates the theoretical and practical differences between two sparse Gaussian Process approximations, FITC and VFE, to guide their application in regression tasks.
Good sparse approximations are essential for practical inference in Gaussian Processes as the computational cost of exact methods is prohibitive for large datasets. The Fully Independent Training Conditional (FITC) and the Variational Free Energy (VFE) approximations are two recent popular methods. Despite superficial similarities, these approximations have surprisingly different theoretical properties and behave differently in practice. We thoroughly investigate the two methods for regression both analytically and through illustrative examples, and draw conclusions to guide practical application.
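To make the contrast between the two approximations concrete, the following is a minimal sketch of their training objectives in the commonly stated unified form: both replace the full covariance K_ff with the low-rank term Q_ff = K_fu K_uu^{-1} K_uf, but FITC adds the diagonal of K_ff - Q_ff to the noise term, whereas VFE keeps the noise term homoscedastic and instead adds a trace penalty tr(K_ff - Q_ff) / (2 * noise_var). The function names, the squared-exponential kernel, the 1-D inputs, and the jitter value are assumptions made for illustration and do not come from the paper. The sketch also builds the N x N matrices densely for readability, so it does not show the computational savings that a practical implementation would obtain from the matrix inversion lemma.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def sparse_gp_objective(x, y, z, noise_var, method="vfe",
                        lengthscale=1.0, variance=1.0, jitter=1e-8):
    """Objective to minimise: negative log marginal likelihood of the FITC
    model, or negative variational lower bound for VFE.

    x: training inputs (N,), y: targets (N,), z: inducing inputs (M,).
    """
    n = x.shape[0]
    k_uu = rbf_kernel(z, z, lengthscale, variance) + jitter * np.eye(z.shape[0])
    k_uf = rbf_kernel(z, x, lengthscale, variance)
    # Low-rank (Nystrom) approximation Q_ff = K_fu K_uu^{-1} K_uf.
    q_ff = k_uf.T @ np.linalg.solve(k_uu, k_uf)
    # diag(K_ff - Q_ff); the kernel diagonal equals `variance` for this kernel.
    residual_diag = np.clip(variance - np.diag(q_ff), 0.0, None)

    if method == "fitc":
        # FITC: heteroscedastic diagonal correction, no trace term.
        g = np.diag(residual_diag + noise_var)
        trace_term = 0.0
    elif method == "vfe":
        # VFE: plain noise term plus a trace penalty on the residual.
        g = noise_var * np.eye(n)
        trace_term = residual_diag.sum() / (2.0 * noise_var)
    else:
        raise ValueError(f"unknown method: {method!r}")

    cov = q_ff + g
    _, logdet = np.linalg.slogdet(cov)
    data_fit = y @ np.linalg.solve(cov, y)
    return 0.5 * (n * np.log(2.0 * np.pi) + logdet + data_fit) + trace_term


# Toy usage: six training points, two inducing points.
x_train = np.linspace(0.0, 5.0, 6)
y_train = np.sin(x_train)
z_inducing = np.array([1.0, 4.0])
for name in ("fitc", "vfe"):
    value = sparse_gp_objective(x_train, y_train, z_inducing, 0.1, method=name)
    print(name, value)
```

When the inducing inputs coincide with the training inputs, Q_ff matches K_ff up to the jitter, so both objectives reduce to the exact negative log marginal likelihood. With fewer inducing points the two objectives diverge: the VFE value never falls below the exact one, while FITC carries no such guarantee.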