LG · Oct 9, 2025

Rényi Sharpness: A Novel Sharpness that Strongly Correlates with Generalization

arXiv:2510.07758v2 · 1 citation · h-index: 3
Originality: Highly original
AI Analysis

This work addresses a fundamental problem in machine learning by providing a more reliable sharpness measure for understanding and improving generalization, with incremental but practical gains for neural network training.

The authors tackle the weak correlation between existing sharpness measures and generalization in neural networks. They propose Rényi sharpness, a measure defined as the negative Rényi entropy of the loss Hessian, which correlates strongly with generalization in their experiments. A variant used as a training regularizer (RSAM) improves test accuracy by up to nearly 2.5% over the classical SAM method.

Sharpness (of the loss minima) is a common measure for investigating the generalization of neural networks. Intuitively, the flatter the landscape near the minima, the better the generalization might be. Unfortunately, the correlation between many existing sharpness measures and generalization is usually not strong, and is sometimes even weak. To close the gap between intuition and reality, we propose a novel sharpness measure, Rényi sharpness, defined as the negative Rényi entropy (a generalization of the classical Shannon entropy) of the loss Hessian. The main ideas are as follows: 1) we observe that uniform (identical) eigenvalues of the loss Hessian are most desirable (while keeping their sum constant) for good generalization; 2) we employ the Rényi entropy to concisely characterize the spread of the eigenvalues of the loss Hessian. Normally, the larger the spread, the smaller the (Rényi) entropy. To rigorously establish the relationship between generalization and (Rényi) sharpness, we provide several generalization bounds in terms of Rényi sharpness, taking advantage of its reparametrization invariance as well as the trick of translating data discrepancy into weight perturbation. Furthermore, extensive experiments verify the strong correlation (specifically, Kendall rank correlation) between Rényi sharpness and generalization. Moreover, we propose using a variant of Rényi sharpness as a regularizer during training, i.e., Rényi Sharpness Aware Minimization (RSAM), which turns out to outperform all existing sharpness-aware minimization methods. It is worth noting that the test accuracy gain of our proposed RSAM method can be as high as nearly 2.5%, compared against the classical SAM method.
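To make the "uniform eigenvalues are most desirable" idea concrete, here is a minimal sketch in Python. It rests on assumptions that the abstract does not state: the Hessian eigenvalues are taken to be nonnegative and are normalized by their sum to form a probability vector, and the textbook order-alpha Rényi entropy is applied to that vector. The function names `renyi_entropy` and `renyi_sharpness`, the default `alpha=2.0`, and the toy spectra are all hypothetical; the paper's exact definition may differ.

```python
import math


def renyi_entropy(probs, alpha=2.0):
    """Textbook Rényi entropy of order alpha (natural log) of a probability vector."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(alpha - 1.0) < 1e-12:
        # The alpha -> 1 limit is the Shannon entropy.
        return -sum(p * math.log(p) for p in probs if p > 0)
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


def renyi_sharpness(eigenvalues, alpha=2.0):
    """Negative Rényi entropy of the sum-normalized spectrum.

    ASSUMPTION: eigenvalues are nonnegative and normalized by their sum;
    this normalization is an illustrative choice, not taken from the paper.
    """
    if any(ev < 0 for ev in eigenvalues):
        raise ValueError("this sketch assumes nonnegative eigenvalues")
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalue sum must be positive")
    probs = [ev / total for ev in eigenvalues]
    return -renyi_entropy(probs, alpha)


# Two toy spectra with the same sum (4.0): one uniform, one widely spread.
uniform_spectrum = [1.0, 1.0, 1.0, 1.0]
spread_spectrum = [3.7, 0.1, 0.1, 0.1]

sharpness_uniform = renyi_sharpness(uniform_spectrum)
sharpness_spread = renyi_sharpness(spread_spectrum)
print("uniform:", sharpness_uniform)
print("spread: ", sharpness_spread)
```

Under these assumptions, the uniform spectrum attains the maximum entropy log(4), so its sharpness is the minimum possible for four eigenvalues, while the spread spectrum yields a strictly larger (sharper) value. For a real network the eigenvalues would have to be estimated from the loss Hessian, which this sketch does not attempt.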
