ML · LG · Jan 29, 2022

Error Scaling Laws for Kernel Classification under Source and Capacity Conditions

arXiv:2201.12655v3 · 18 citations
Originality: Synthesis-oriented
AI Analysis

This work derives scaling laws for kernel classification that are accurate on some real datasets, addressing a gap left by worst-case bounds, which often fail to describe real learning curves. It is incremental in that it builds on standard conditions and methods.

The authors derive explicit decay rates for the misclassification error in kernel classification under source and capacity conditions, for both margin-maximizing SVM and ridge classification, and show that these rates closely fit the learning curves of some real datasets.

We consider the problem of kernel classification. While worst-case bounds on the decay rate of the prediction error with the number of samples are known for some classifiers, they often fail to accurately describe the learning curves of real data sets. In this work, we consider the important class of data sets satisfying the standard source and capacity conditions, comprising a number of real data sets as we show numerically. Under the Gaussian design, we derive the decay rates for the misclassification (prediction) error as a function of the source and capacity coefficients. We do so for two standard kernel classification settings, namely margin-maximizing Support Vector Machines (SVM) and ridge classification, and contrast the two methods. We find that our rates tightly describe the learning curves for this class of data sets, and are also observed on real data. Our results can also be seen as an explicit prediction of the exponents of a scaling law for kernel classification that is accurate on some real datasets.
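As an illustration of what "the exponents of a scaling law" means in practice, the sketch below estimates a decay exponent from a learning curve by a least-squares fit in log-log space, assuming the error follows a pure power law `error ≈ C * n**(-beta)`. The function name `fit_decay_exponent`, the synthetic curve, and the value `true_beta = 0.3` are illustrative assumptions; they are not taken from the paper and do not reproduce its rates.

```python
import numpy as np


def fit_decay_exponent(n_samples, errors):
    """Fit log(error) = log(C) - beta * log(n) by least squares.

    Returns (beta, C). Hypothetical helper for illustration only.
    """
    log_n = np.log(np.asarray(n_samples, dtype=float))
    log_err = np.log(np.asarray(errors, dtype=float))
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(log_n, log_err, deg=1)
    return -slope, np.exp(intercept)


# Synthetic, noise-free learning curve with an arbitrary exponent
# (an assumption for the demo, not a rate from the paper).
n_grid = np.logspace(2, 5, 10)
true_beta = 0.3
synthetic_errors = 0.5 * n_grid ** (-true_beta)

beta_hat, c_hat = fit_decay_exponent(n_grid, synthetic_errors)
print(f"estimated exponent: {beta_hat:.3f}, prefactor: {c_hat:.3f}")
```

On measured learning curves the fit is usually restricted to the large-sample range, since a power law is only expected to hold asymptotically; the fitted exponent can then be compared with a rate predicted from the source and capacity coefficients.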

