LG ML

Jun 21, 2025

Flatness After All?

Neta Shoham, Liron Mor-Yosef, Haim Avron

arXiv:2506.17809v2

h-index: 25

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reliably estimating generalization in deep neural networks, which is crucial for researchers and practitioners in machine learning, though it is incremental as it builds on existing flatness theories.

The paper tackles the problem of assessing generalization in deep learning by proposing a soft rank measure of the Hessian to capture flatness, showing it accurately estimates generalization gaps for calibrated models and connects to established criteria for non-calibrated ones, with experimental results indicating robust performance compared to baselines.

Recent literature on generalization in deep learning has examined the relationship between the curvature of the loss function at minima and generalization, mainly in the context of overparameterized neural networks. A key observation is that "flat" minima tend to generalize better than "sharp" minima. While this idea is supported by empirical evidence, it has also been shown that deep networks can generalize even with arbitrary sharpness, as measured by either the trace or the spectral norm of the Hessian. In this paper, we argue that generalization could be assessed by measuring flatness using a soft rank measure of the Hessian. We show that when an exponential family neural network model is exactly calibrated, and its prediction error and its confidence on the prediction are not correlated with the first and the second derivative of the network's output, our measure accurately captures the asymptotic expected generalization gap. For non-calibrated models, we connect a soft-rank-based flatness measure to the well-known Takeuchi Information Criterion and show that it still provides reliable estimates of generalization gaps for models that are not overly confident. Experimental results indicate that our approach offers a robust estimate of the generalization gap compared to baselines.
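To make the contrast with trace and spectral norm concrete, the following is a minimal sketch, not the paper's implementation. It assumes one common definition of soft rank, Tr(H(H + λI)^{-1}), which equals the sum of e_i / (e_i + λ) over the eigenvalues e_i of a positive semidefinite Hessian H. The abstract does not spell out the definition, so the formula, the function name `soft_rank`, the value of `lam`, and the two toy spectra are all assumptions made for illustration.

```python
def soft_rank(eigenvalues, lam):
    """Assumed soft rank: sum_i e_i / (e_i + lam) for PSD eigenvalues e_i.

    Each eigenvalue contributes a value in [0, 1), so the result is
    bounded by the number of eigenvalues regardless of their scale.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if any(e < 0 for e in eigenvalues):
        raise ValueError("eigenvalues must be non-negative (PSD Hessian)")
    return sum(e / (e + lam) for e in eigenvalues)


# Hypothetical Hessian spectra (illustrative only, not from the paper).
sharp_low_rank = [1000.0, 1000.0, 0.0, 0.0, 0.0, 0.0]  # two very sharp directions
mild_full_rank = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]        # six mildly curved directions

lam = 1.0  # assumed smoothing parameter

for name, spectrum in [("sharp_low_rank", sharp_low_rank),
                       ("mild_full_rank", mild_full_rank)]:
    trace = sum(spectrum)            # trace of the Hessian
    spectral_norm = max(spectrum)    # largest eigenvalue for a PSD matrix
    print(f"{name}: trace={trace:.1f}, "
          f"spectral_norm={spectral_norm:.1f}, "
          f"soft_rank={soft_rank(spectrum, lam):.3f}")
```

In this toy setting the first spectrum has a far larger trace and spectral norm than the second, yet its soft rank is smaller, because soft rank counts how many directions are curved relative to `lam` rather than how strongly they are curved. Under the stated assumption, this is the sense in which a soft rank measure can stay informative when sharpness is arbitrarily large.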
