Fast DPP Sampling for Nyström with Application to Kernel Methods
The work addresses the computational bottleneck of kernel methods, which matters to machine learning practitioners, though it appears incremental because it builds on the existing DPP and Nyström frameworks.
The paper tackles landmark selection for the Nyström method by sampling landmarks from Determinantal Point Processes (DPPs). It proves approximation error bounds, shows that Markov chain DPP sampling can run in time linear in the data size under certain conditions, and reports empirical results in which DPP-based selection outperforms existing approaches.
The Nyström method has long been popular for scaling up kernel methods. Its theoretical guarantees and empirical performance rely critically on the quality of the landmarks selected. We study landmark selection for Nyström using Determinantal Point Processes (DPPs), discrete probability models that allow tractable generation of diverse samples. We prove that landmarks selected via DPPs guarantee bounds on approximation errors; subsequently, we analyze implications for kernel ridge regression. Contrary to prior reservations due to the cubic complexity of DPP sampling, we show that (under certain conditions) Markov chain DPP sampling requires only linear time in the size of the data. We present several empirical results that support our theoretical analysis, and demonstrate the superior performance of DPP-based landmark selection compared with existing approaches.
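To make the pipeline in the abstract concrete, the following is a minimal sketch of Markov chain landmark selection followed by a Nyström approximation. It rests on several assumptions that are not taken from the paper: the chain is a simple Metropolis swap chain over fixed-size subsets (a k-DPP variant), the kernel is an RBF kernel with an arbitrary bandwidth, and all function names (`rbf_kernel`, `log_det`, `sample_kdpp_mcmc`, `nystrom`) are hypothetical. For readability it recomputes each determinant from scratch and builds the full kernel matrix, so it illustrates the idea but does not achieve the linear-time behavior the abstract describes.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Dense RBF kernel matrix; gamma is an illustrative bandwidth (assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def log_det(K, idx):
    """Log-determinant of the principal submatrix K[idx, idx]; -inf if not positive."""
    sign, val = np.linalg.slogdet(K[np.ix_(idx, idx)])
    return float(val) if sign > 0 else -np.inf


def sample_kdpp_mcmc(K, k, n_steps, rng):
    """Metropolis swap chain whose stationary law is proportional to det(K[Y, Y])."""
    n = K.shape[0]
    Y = [int(i) for i in rng.choice(n, size=k, replace=False)]
    cur = log_det(K, Y)
    for _ in range(n_steps):
        pos = int(rng.integers(k))   # position in Y of the element to swap out
        v = int(rng.integers(n))     # candidate element to swap in
        if v in Y:
            continue                 # lazy step: keep the current set
        proposal = list(Y)
        proposal[pos] = v
        new = log_det(K, proposal)
        if new == -np.inf:
            continue                 # singular proposal has zero probability
        # Accept with probability min(1, det_new / det_cur).
        if rng.random() < np.exp(min(0.0, new - cur)):
            Y, cur = proposal, new
    return np.array(sorted(Y))


def nystrom(K, idx):
    """Nystrom approximation K[:, idx] @ pinv(K[idx, idx]) @ K[idx, :]."""
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    return C @ np.linalg.pinv(W) @ C.T


# Usage on synthetic data (sizes and step count are arbitrary choices).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
K = rbf_kernel(X, gamma=0.5)
landmarks = sample_kdpp_mcmc(K, k=8, n_steps=500, rng=rng)
K_hat = nystrom(K, landmarks)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print(f"landmarks={landmarks.tolist()}, relative Frobenius error={rel_err:.3f}")
```

Because the acceptance ratio favors subsets with larger determinants, the chain tends to settle on mutually dissimilar landmarks. A more efficient variant would update the determinant ratio incrementally instead of calling `slogdet` at every step.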