LGDec 2, 2020

On the Error Resistance of Hinge Loss Minimization

arXiv:2012.00989v14.26 citations

Originality Incremental advance

AI Analysis

This work provides theoretical guarantees for the robustness of common classification algorithms, which is important for practitioners dealing with noisy datasets.

This paper investigates the robustness of hinge loss minimization algorithms to errors in training data. It shows that under specific data conditions (linearly separable with a margin of at least C/sqrt(d), near-isotropic and log-concave class-conditional distributions), these algorithms can learn the correct classifier even when a constant fraction of examples are adversarially mislabeled.

Commonly used classification algorithms in machine learning, such as support vector machines, minimize a convex surrogate loss on training examples. In practice, these algorithms are surprisingly robust to errors in the training data. In this work, we identify a set of conditions on the data under which such surrogate loss minimization algorithms provably learn the correct classifier. This allows us to establish, in a unified framework, the robustness of these algorithms under various models on data as well as error. In particular, we show that if the data is linearly classifiable with a slightly non-trivial margin (i.e. a margin at least $C/\sqrt{d}$ for $d$-dimensional unit vectors), and the class-conditional distributions are near isotropic and logconcave, then surrogate loss minimization has negligible error on the uncorrupted data even when a constant fraction of examples are adversarially mislabeled.

View on arXiv PDF

Similar