ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions
This work addresses the scalability problem that kernel methods pose for machine learning practitioners, though it appears incremental, as it builds on existing partitioning and projection techniques.
The paper tackles the computational challenge of large-scale kernel ridge regression by introducing ParK, which combines feature space partitioning with random projections and iterative optimization to reduce space and time complexity while maintaining statistical accuracy. The authors demonstrate the method's effectiveness through numerical experiments on large-scale datasets.
We introduce ParK, a new large-scale solver for kernel ridge regression. Our approach combines partitioning with random projections and iterative optimization to reduce space and time complexity while provably maintaining the same statistical accuracy. In particular, by constructing suitable partitions directly in the feature space rather than in the input space, we promote orthogonality between the local estimators, thus ensuring that key quantities such as local effective dimension and bias remain under control. We characterize the statistical-computational tradeoff of our model, and demonstrate the effectiveness of our method by numerical experiments on large-scale datasets.
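To make the three ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel, a Voronoi-style partition computed from feature-space distances to randomly chosen centroids, uniform Nyström subsampling as the random projection, and a hand-written conjugate gradient loop as the iterative solver. All function names (`fit_park_sketch`, `predict_park_sketch`, `assign_cells`) and hyperparameter values are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=0.5):
    # Pairwise Gaussian kernel between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] - 2.0 * A @ B.T + (B**2).sum(1)[None, :]
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def assign_cells(X, centroids, kern):
    # Nearest centroid in feature space:
    # ||phi(x) - phi(c)||^2 = k(x,x) - 2 k(x,c) + k(c,c).
    # k(x,x) does not depend on c, so it is dropped from the argmin.
    kcc = np.diag(kern(centroids, centroids))
    return np.argmin(kcc[None, :] - 2.0 * kern(X, centroids), axis=1)

def conjugate_gradient(A, b, iters=100, tol=1e-10):
    # Plain conjugate gradient for a symmetric positive definite system A x = b.
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    b_norm = np.sqrt(b @ b)
    for _ in range(iters):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Ap = A @ p
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def fit_park_sketch(X, y, n_cells=4, m=30, lam=1e-3, seed=0):
    # Hypothetical driver: partition the data, then fit one Nystrom KRR per cell.
    rng = np.random.default_rng(seed)
    kern = gaussian_kernel
    # Assumption: centroids are data points picked uniformly at random.
    centroids = X[rng.choice(len(X), n_cells, replace=False)]
    cells = assign_cells(X, centroids, kern)
    models = []
    for j in range(n_cells):
        Xj, yj = X[cells == j], y[cells == j]
        # Random projection step: restrict the local estimator to the span
        # of m randomly selected points of the cell (uniform Nystrom).
        Z = Xj[rng.choice(len(Xj), min(m, len(Xj)), replace=False)]
        Knm, Kmm = kern(Xj, Z), kern(Z, Z)
        # Normal equations of Nystrom KRR, with a small jitter for stability.
        A = Knm.T @ Knm + lam * len(Xj) * Kmm + 1e-8 * np.eye(len(Z))
        models.append((Z, conjugate_gradient(A, Knm.T @ yj)))
    return centroids, models, kern

def predict_park_sketch(model, X):
    # Route each point to its cell and evaluate that cell's local estimator.
    centroids, models, kern = model
    cells = assign_cells(X, centroids, kern)
    out = np.zeros(len(X))
    for j, (Z, alpha) in enumerate(models):
        mask = cells == j
        if mask.any():
            out[mask] = kern(X[mask], Z) @ alpha
    return out

# Toy usage on synthetic 1-D data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(600)
model = fit_park_sketch(X, y)
X_test = rng.uniform(-3, 3, size=(200, 1))
pred = predict_park_sketch(model, X_test)
mse = np.mean((pred - np.sin(2 * X_test[:, 0])) ** 2)
print(f"test MSE: {mse:.4f}")
```

Each local problem involves only the points of one cell and at most `m` Nyström centers, which is where the savings in space and time would come from in a scheme of this kind. Note that for a Gaussian kernel with single-point centroids, the feature-space assignment above coincides with nearest-centroid assignment in the input space; the partition construction in the paper, which is designed to promote orthogonality between local estimators, is presumably more refined than this simplification.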