STLGMLMay 22, 2013

Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates

arXiv:1305.5029v2405 citations
Originality Incremental advance
AI Analysis

This provides a scalable solution for kernel ridge regression in big data settings, though it is incremental as it builds on existing distributed learning approaches.

The paper tackles the computational challenge of kernel ridge regression on large datasets by proposing a distributed algorithm that partitions data into subsets, computes local estimators, and averages them. The method achieves the statistical minimax optimal rate while substantially reducing computation time, with theoretical guarantees allowing the number of processors to grow nearly linearly for certain kernels.

We establish optimal convergence rates for a decomposition-based scalable approach to kernel ridge regression. The method is simple to describe: it randomly partitions a dataset of size N into m subsets of equal size, computes an independent kernel ridge regression estimator for each subset, then averages the local solutions into a global predictor. This partitioning leads to a substantial reduction in computation time versus the standard approach of performing kernel ridge regression on all N samples. Our two main theorems establish that despite the computational speed-up, statistical optimality is retained: as long as m is not too large, the partition-based estimator achieves the statistical minimax rate over all estimators using the set of N samples. As concrete examples, our theory guarantees that the number of processors m may grow nearly linearly for finite-rank kernels and Gaussian kernels and polynomially in N for Sobolev spaces, which in turn allows for substantial reductions in computational cost. We conclude with experiments on both simulated data and a music-prediction task that complement our theoretical results, exhibiting the computational and statistical benefits of our approach.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes