ML · LG · ST · Apr 25, 2020

Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime

arXiv:2004.12019v4 · 117 citations
AI Analysis

This paper provides theoretical guarantees for interpolating classifiers on noisy, high-dimensional data, an incremental extension of existing work on clean data.

The paper analyzes the population risk of the maximum margin algorithm for linear classification in overparameterized settings with noisy data, showing it can achieve nearly optimal risk with sufficient over-parameterization.

We prove bounds on the population risk of the maximum margin algorithm for two-class linear classification. For linearly separable training data, the maximum margin algorithm has been shown in previous work to be equivalent to a limit of training with logistic loss using gradient descent, as the training error is driven to zero. We analyze this algorithm applied to random data including misclassification noise. Our assumptions on the clean data include the case in which the class-conditional distributions are standard normal distributions. The misclassification noise may be chosen by an adversary, subject to a limit on the fraction of corrupted labels. Our bounds show that, with sufficient over-parameterization, the maximum margin algorithm trained on noisy data can achieve nearly optimal population risk.
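The setting in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experiment or proof technique: it runs a finite number of gradient descent steps on the logistic loss as a stand-in for the maximum margin classifier (the equivalence holds only in the limit), and every concrete choice here is an assumption made for illustration, including the function names, the dimensions (n = 20 samples, d = 300 features), the signal strength, the 10% flip rate, and the step size. Clean data are drawn with standard normal class-conditional noise around a mean of +mu or -mu, and a fixed fraction of training labels is flipped.

```python
import math
import random


def make_data(n, d, signal, flip_fraction, rng):
    """Draw n points in R^d: clean label y in {-1, +1}, x = y * mu + z,
    with mu = (signal, 0, ..., 0) and z standard normal.
    The first int(flip_fraction * n) labels are flipped (label noise)."""
    xs, ys = [], []
    n_flip = int(flip_fraction * n)
    for i in range(n):
        y = rng.choice((-1, 1))
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x[0] += y * signal
        xs.append(x)
        ys.append(-y if i < n_flip else y)
    return xs, ys


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_logistic_gd(xs, ys, steps, lr):
    """Plain gradient descent on the average logistic loss, starting at w = 0."""
    n, d = len(xs), len(xs[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(xs, ys):
            margin = y * dot(w, x)
            # d/dw log(1 + exp(-margin)) = -y * x / (1 + exp(margin));
            # the clamp only guards against overflow in exp.
            coeff = -y / (1.0 + math.exp(min(margin, 50.0))) / n
            for j in range(d):
                grad[j] += coeff * x[j]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def error_rate(w, xs, ys):
    """Fraction of points with non-positive margin."""
    wrong = sum(1 for x, y in zip(xs, ys) if y * dot(w, x) <= 0.0)
    return wrong / len(xs)


rng = random.Random(0)
n, d, signal = 20, 300, 4.0  # illustrative values with d much larger than n

# Training set with 10% of the labels flipped.
train_x, train_y = make_data(n, d, signal, 0.1, rng)
w = train_logistic_gd(train_x, train_y, steps=300, lr=0.05)

# Fresh test set with clean labels, used to estimate the population risk.
test_x, test_y = make_data(200, d, signal, 0.0, rng)

train_error = error_rate(w, train_x, train_y)
test_error = error_rate(w, test_x, test_y)
print("training error on noisy labels:", train_error)
print("test error on clean labels:", test_error)
```

With d much larger than n the noisy training set is linearly separable, so the classifier can fit every label, flipped ones included, while its error on clean test data stays well below the 50% of random guessing. Shrinking d toward n, or raising the flip fraction, is a simple way to probe how much over-parameterization this toy setup needs.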

