ML | LG | Jun 28, 2020

For interpolating kernel machines, minimizing the norm of the ERM solution minimizes stability

arXiv:2006.15522v2 | 6 citations
AI Analysis

This work addresses stability issues in overparameterized kernel methods, a topic of interest to machine learning practitioners, and provides theoretical insight into the double descent phenomenon. It is incremental, however, in that it builds on existing interpolation and stability concepts.

The paper tackles the stability of kernel ridge-less regression in the interpolating regime, showing that the minimum norm solution minimizes a bound on leave-one-out cross-validation stability, and that this bound is in turn linked to the condition number of the kernel matrix. In an asymptotic analysis with random kernel matrices, this leads to a predicted double descent curve in test error.
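To make "minimum norm interpolating solution" concrete, here is a minimal NumPy sketch. It is not taken from the paper: the linear kernel, the Gaussian data, the sizes `n` and `d`, and all variable names are assumptions chosen for illustration. It builds the pseudoinverse solution, builds a second interpolant by adding a null-space component, and reports the condition number of the Gram matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 50                        # assumed sizes: fewer samples than features
X = rng.standard_normal((n, d))      # assumed synthetic inputs
y = rng.standard_normal(n)           # assumed synthetic targets

# Linear-kernel Gram matrix and ridge-less (pseudoinverse) coefficients.
K = X @ X.T
c = np.linalg.pinv(K) @ y

# Minimum-norm interpolant in feature space: w_min = X^T K^+ y = X^+ y.
w_min = X.T @ c

# A different interpolant: add any vector from the null space of X.
P_null = np.eye(d) - np.linalg.pinv(X) @ X
w_other = w_min + P_null @ rng.standard_normal(d)

# Both fit the training data exactly (up to floating-point error).
residual_min = np.max(np.abs(X @ w_min - y))
residual_other = np.max(np.abs(X @ w_other - y))

norm_min = np.linalg.norm(w_min)
norm_other = np.linalg.norm(w_other)
cond_K = np.linalg.cond(K)           # condition number of the empirical kernel matrix

print(f"residuals: {residual_min:.2e}, {residual_other:.2e}")
print(f"norms: min-norm {norm_min:.3f} vs other {norm_other:.3f}")
print(f"cond(K) = {cond_K:.2f}")
```

Both weight vectors interpolate the data, but only the pseudoinverse solution has the smallest norm. The summary's claim is that this choice also minimizes a bound on leave-one-out stability, a bound the sketch does not compute.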

We study the average $\mbox{CV}_{loo}$ stability of kernel ridge-less regression and derive corresponding risk bounds. We show that the interpolating solution with minimum norm minimizes a bound on $\mbox{CV}_{loo}$ stability, which in turn is controlled by the condition number of the empirical kernel matrix. The latter can be characterized in the asymptotic regime where both the dimension and cardinality of the data go to infinity. Under the assumption of random kernel matrices, the corresponding test error should be expected to follow a double descent curve.
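The link between the condition number and the ratio of sample size to dimension can be illustrated numerically. The following sketch is an assumption-laden toy, not the paper's asymptotic analysis: it uses a linear kernel on Gaussian data, a hypothetical helper `pseudo_condition_number` that ignores numerically zero singular values, and a small fixed dimension `d`.

```python
import numpy as np

def pseudo_condition_number(K, tol=1e-10):
    """Ratio of the largest to the smallest nonzero singular value of K."""
    s = np.linalg.svd(K, compute_uv=False)   # singular values, descending
    s = s[s > tol * s[0]]                    # drop numerically zero values
    return s[0] / s[-1]

def median_condition(n, d, trials=20, seed=0):
    """Median pseudo-condition number of X X^T / d over random Gaussian X."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(trials):
        X = rng.standard_normal((n, d))
        vals.append(pseudo_condition_number(X @ X.T / d))
    return float(np.median(vals))

d = 40                                       # assumed feature dimension
sample_sizes = [10, 20, 40, 80, 160]         # assumed sweep across n < d, n = d, n > d
conds = {n: median_condition(n, d) for n in sample_sizes}

for n, kappa in conds.items():
    print(f"n/d = {n / d:4.2f}  median cond = {kappa:.1f}")
```

In this toy setting the condition number is largest when `n` equals `d` and smaller on either side. That matches the shape one would expect if test error tracks a stability bound controlled by the condition number, as the abstract suggests.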

