LG | ML | Jan 3, 2020

Relative Flatness and Generalization

arXiv:2001.00939v4 | 102 citations
AI Analysis

This work provides a theoretical foundation for understanding generalization in machine learning, which is crucial for researchers and practitioners aiming to improve model robustness and performance.

The paper tackles the open problem of why flatness of the loss curve correlates with generalization in neural networks, addressing issues like reparameterization invariance, and introduces a relative flatness measure that strongly correlates with generalization and resolves these theoretical challenges.

Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks. While it has been empirically observed that flatness measures consistently correlate strongly with generalization, it is still an open theoretical problem why and under which circumstances flatness is connected to generalization, in particular in light of reparameterizations that change certain flatness measures but leave generalization unchanged. We investigate the connection between flatness and generalization by relating it to the interpolation from representative data, deriving notions of representativeness, and feature robustness. The notions allow us to rigorously connect flatness and generalization and to identify conditions under which the connection holds. Moreover, they give rise to a novel, but natural relative flatness measure that correlates strongly with generalization, simplifies to ridge regression for ordinary least squares, and solves the reparameterization issue.
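The reparameterization issue mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's measure or code: it assumes a hypothetical two-parameter ReLU model f(x) = w2 * relu(w1 * x) with squared loss, and uses the scalar product w2^2 * (d^2 L / d w2^2) as a simplified stand-in for a weight-scaled ("relative") flatness quantity. Rescaling w1 by a > 0 and w2 by 1/a leaves the function unchanged, yet it changes the plain curvature by a factor of a^2, while the weight-scaled quantity stays the same.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def predict(w1, w2, x):
    # Toy model (an assumption for illustration): f(x) = w2 * relu(w1 * x)
    return w2 * relu(w1 * x)

def hessian_w2(w1, x):
    # Second derivative of mean((w2 * relu(w1 * x) - y)^2) with respect to w2.
    # It equals 2 * mean(relu(w1 * x)^2) and does not depend on w2 or y.
    return 2.0 * np.mean(relu(w1 * x) ** 2)

def relative_flatness_proxy(w1, w2, x):
    # Simplified scalar stand-in for a relative measure: curvature scaled by w2^2.
    return w2 ** 2 * hessian_w2(w1, x)

rng = np.random.default_rng(0)
x = rng.normal(size=200)

w1, w2 = 0.7, -1.3
a = 5.0                        # positive rescaling factor
w1_s, w2_s = a * w1, w2 / a    # reparameterized weights

# The rescaled model computes the same function on every input.
same_function = np.allclose(predict(w1, w2, x), predict(w1_s, w2_s, x))

# Plain curvature changes by a^2; the weight-scaled quantity does not change.
plain_ratio = hessian_w2(w1_s, x) / hessian_w2(w1, x)
relative_ratio = (relative_flatness_proxy(w1_s, w2_s, x)
                  / relative_flatness_proxy(w1, w2, x))

print("same function:", same_function)
print("plain curvature ratio:", plain_ratio)
print("weight-scaled ratio:", relative_ratio)
```

The invariance here follows from relu(a * z) = a * relu(z) for a > 0: the curvature with respect to w2 grows by a^2 exactly as w2^2 shrinks by 1/a^2. The paper's measure is defined for full networks, so this scalar case only conveys the intuition.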

Code Implementations: 1 repo
