A Bound on the Maximal Marginal Degrees of Freedom
This work provides theoretical justification for low-rank approximations in kernel ridge regression, benefiting practitioners dealing with large-scale data, though it is incremental as it builds on existing methods like Nyström.
The paper tackles the computational and memory challenges of kernel ridge regression by establishing a lower bound on the minimal rank for reliable low-rank approximations, showing that the Nyström method's cost is nearly linear in sample size and extending feasible regularization parameter choices.
Kernel ridge regression, in general, is expensive in memory allocation and computation time. This paper addresses low rank approximations and surrogates for kernel ridge regression, which bridge these difficulties. The fundamental contribution of the paper is a lower bound on the minimal rank such that the prediction power of the approximation remains reliable. Based on this bound, we demonstrate that the computational cost of the most popular low rank approach, which is the Nyström method, is almost linear in the sample size. This justifies the method from a theoretical point of view. Moreover, the paper provides a significant extension of the feasible choices of the regularization parameter. The result builds on a thorough theoretical analysis of the approximation of elementary kernel functions by elements in the range of the associated integral operator. We provide estimates of the approximation error and characterize the behavior of the norm of the underlying weight function.