LG · Feb 14, 2023

The Missing Margin: How Sample Corruption Affects Distance to the Boundary in ANNs

Marthinus W. Theunissen, Coenraad Mouton, Marelie H. Davel

arXiv:2302.06925

Originality: Synthesis-oriented

AI Analysis

This work provides incremental insights into margin analysis for researchers in machine learning generalization.

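The study investigates how sample corruption affects classification margins in artificial neural networks. It finds that certain types of training samples are modelled with consistently small margins while affecting generalization in different ways, and it links this to the minimum distance to a different-target sample and to how remote samples are from one another.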

Classification margins are commonly used to estimate the generalization ability of machine learning models. We present an empirical study of these margins in artificial neural networks. A global estimate of margin size is usually used in the literature. In this work, we point out seldom-considered nuances regarding classification margins. Notably, we demonstrate that some types of training samples are modelled with consistently small margins while affecting generalization in different ways. By showing a link with the minimum distance to a different-target sample and the remoteness of samples from one another, we provide a plausible explanation for this observation. We support our findings with an analysis of fully-connected networks trained on noise-corrupted MNIST data, as well as convolutional networks trained on noise-corrupted CIFAR10 data.
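
One quantity the abstract names, the minimum distance from a sample to any sample with a different target, can be illustrated with a short sketch. This is not the authors' code: it assumes Euclidean distance in input space (the abstract does not state the metric), and the function name `min_dist_to_different_target` and the toy data are hypothetical.

```python
import numpy as np


def min_dist_to_different_target(X, y):
    """For each row of X, return the Euclidean distance to the nearest
    row whose label differs (np.inf if no such row exists).

    Assumption: Euclidean distance in input space; hypothetical helper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negatives from round-off
    # Exclude same-label pairs (this also excludes each sample itself)
    d2[y[:, None] == y[None, :]] = np.inf
    return np.sqrt(d2.min(axis=1))


# Toy data: two nearby points of class 0 and one distant point of class 1
X_toy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
y_clean = np.array([0, 0, 1])
# Simulated label corruption: the second sample's label is flipped
y_corrupt = np.array([0, 1, 1])

dist_clean = min_dist_to_different_target(X_toy, y_clean)
dist_corrupt = min_dist_to_different_target(X_toy, y_corrupt)
```

In this toy case, flipping one label places a different-target sample right next to its former same-class neighbour, so the minimum distance for both of those points shrinks. That is the kind of geometric effect the abstract connects to small margins, although the paper's actual measurements are made on trained networks rather than on raw input distances alone.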
