Connections and Equivalences between the Nyström Method and Sparse Variational Gaussian Processes
This work addresses a gap in understanding between the Gaussian process and kernel communities, which may facilitate knowledge transfer between them. It is incremental, however, since it builds on existing methods without introducing new paradigms.
The paper clarifies how sparse approximation methods for kernel methods and Gaussian processes are connected, focusing on the Nyström method and Sparse Variational Gaussian Processes (SVGP). It establishes equivalences between the two, for example by showing that the SVGP's Evidence Lower Bound contains the Nyström objective, and it provides an RKHS interpretation of SVGP.
We investigate the connections between sparse approximation methods for making kernel methods and Gaussian processes (GPs) scalable to large-scale data, focusing on the Nyström method and Sparse Variational Gaussian Processes (SVGP). While sparse approximation methods for GPs and kernel methods share some algebraic similarities, the literature lacks a deep understanding of how and why they are related. This may pose an obstacle to communication between the GP and kernel communities, making it difficult to transfer results from one side to the other. Our motivation is to remove this obstacle by clarifying the connections between the sparse approximations for GPs and kernel methods. In this work, we study two popular approaches, the Nyström and SVGP approximations, in the context of a regression problem, and establish various connections and equivalences between them. In particular, we provide an RKHS interpretation of the SVGP approximation, and show that the Evidence Lower Bound of the SVGP contains the objective function of the Nyström approximation, revealing the origin of the algebraic equivalence between the two approaches. We also study recently established convergence results for the SVGP and how they are related to the approximation quality of the Nyström method.
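The algebraic equivalence mentioned above can be checked numerically in the regression setting. The following is a minimal sketch and is not taken from the paper: the kernel, the toy data, the inducing points `Z`, and all variable names are illustrative assumptions. It also assumes the standard correspondence between the kernel ridge regularizer and the GP noise variance, `n * lam == noise_var`, and uses the well-known closed-form optimal variational mean for SVGP with a Gaussian likelihood. Under these assumptions, the Nyström kernel ridge predictor, the SVGP posterior mean, and the GP posterior mean under the Nyström-approximated kernel should coincide.

```python
import numpy as np


def rbf(A, B, lengthscale=0.7):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


# Toy 1-D regression data (illustrative, not from the paper).
rng = np.random.default_rng(0)
n, m = 200, 10
X = rng.uniform(-3.0, 3.0, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
Z = np.linspace(-3.0, 3.0, m)[:, None]        # inducing / landmark points
X_test = np.linspace(-3.0, 3.0, 50)[:, None]

noise_var = 0.1**2       # GP likelihood noise variance
lam = noise_var / n      # assumed correspondence: n * lam == noise_var

K_zz = rbf(Z, Z)
K_zx = rbf(Z, X)
K_sz = rbf(X_test, Z)

# 1) Nystrom kernel ridge regression: minimise
#    (1/n) * sum_i (y_i - f(x_i))^2 + lam * ||f||^2
#    over f = sum_j alpha_j k(., z_j), which gives a linear system in alpha.
alpha = np.linalg.solve(K_zx @ K_zx.T + n * lam * K_zz, K_zx @ y)
nystrom_mean = K_sz @ alpha

# 2) SVGP with the closed-form optimal variational mean for Gaussian noise:
#    m_opt = K_zz (K_zz + K_zx K_xz / noise_var)^{-1} K_zx y / noise_var,
#    predictive mean = K_sz K_zz^{-1} m_opt.
Sigma_inv = K_zz + K_zx @ K_zx.T / noise_var
m_opt = K_zz @ np.linalg.solve(Sigma_inv, K_zx @ y) / noise_var
svgp_mean = K_sz @ np.linalg.solve(K_zz, m_opt)

# 3) Exact GP regression with the kernel replaced by its Nystrom
#    approximation Q(x, x') = k(x, Z) K_zz^{-1} k(Z, x').
Q_xx = K_zx.T @ np.linalg.solve(K_zz, K_zx)
Q_sx = K_sz @ np.linalg.solve(K_zz, K_zx)
nystrom_gp_mean = Q_sx @ np.linalg.solve(Q_xx + noise_var * np.eye(n), y)

# Maximum absolute differences between the three predictors.
print(np.max(np.abs(nystrom_mean - svgp_mean)))
print(np.max(np.abs(nystrom_mean - nystrom_gp_mean)))
```

Both printed differences should be at the level of floating-point round-off. Note that this only compares predictive means; the SVGP additionally provides a posterior covariance, and the sketch does not attempt to reproduce the paper's analysis of the Evidence Lower Bound or its convergence results.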