Empirical study of extreme overfitting points of neural networks
This work addresses generalization failures in neural networks, a topic of interest to researchers, though the contribution appears incremental because it builds on existing knowledge about critical points.
The authors tackle extreme overfitting in neural networks by identifying parameters that yield near-perfect training accuracy but near-zero test accuracy, and they study the properties of these points and their location on the loss surface.
In this paper we propose a method for obtaining points of extreme overfitting: parameters of modern neural networks at which they achieve close to 100% training accuracy together with almost zero accuracy on the test sample. Despite the widespread opinion that the overwhelming majority of critical points of a neural network's loss function have equally good generalization ability, such points have a huge generalization error. The paper studies the properties of such points and their location on the loss surface of modern neural networks.
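To make the definition concrete, the following minimal sketch checks whether a set of predictions matches the description of an extreme overfitting point. It does not reproduce the paper's method for finding such points. The function names and the thresholds `train_min=0.99` and `test_max=0.01` are assumptions chosen for illustration, since the abstract says only "close to 100%" and "almost zero".

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that match the labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty sample")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def is_extreme_overfitting(
    train_acc: float,
    test_acc: float,
    train_min: float = 0.99,  # assumed cutoff for "close to 100%"
    test_max: float = 0.01,   # assumed cutoff for "almost zero"
) -> bool:
    """Check both conditions from the abstract at once (hypothetical helper)."""
    return train_acc >= train_min and test_acc <= test_max


# Toy 3-class data: every training label is reproduced exactly,
# while every test prediction is wrong.
train_labels = [0, 1, 2, 0, 1, 2]
train_preds = [0, 1, 2, 0, 1, 2]
test_labels = [0, 1, 2, 2]
test_preds = [1, 2, 0, 0]

train_acc = accuracy(train_preds, train_labels)
test_acc = accuracy(test_preds, test_labels)
print(train_acc, test_acc, is_extreme_overfitting(train_acc, test_acc))
# prints: 1.0 0.0 True
```

With K balanced classes, a random guess scores about 1/K, so near-zero test accuracy means the network is wrong on the test sample more consistently than chance would be.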