LG · AI · Jan 15

X-SAM: Boosting Sharpness-Aware Minimization with Dominant-Eigenvector Gradient Correction

arXiv:2601.10251v1
h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses a theoretical and practical limitation of SAM as a method for improving generalization in machine learning models, though it is an incremental improvement over existing methods.

The paper tackles the problem that Sharpness-Aware Minimization (SAM) may fail to regularize sharpness effectively because of gradient misalignment. It proposes X-SAM, which corrects the gradient using the leading eigenvector of the Hessian; the authors prove convergence and report improved generalization in experiments.

Sharpness-Aware Minimization (SAM) aims to improve generalization by minimizing a worst-case perturbed loss over a small neighborhood of model parameters. However, during training, its optimization behavior does not always align with theoretical expectations, since both sharp and flat regions may yield a small perturbed loss. In such cases, the gradient may still point toward sharp regions, failing to achieve the intended effect of SAM. To address this issue, we investigate SAM from a spectral and geometric perspective: specifically, we utilize the angle between the gradient and the leading eigenvector of the Hessian as a measure of sharpness. Our analysis illustrates that when this angle is less than or equal to ninety degrees, the effect of SAM's sharpness regularization can be weakened. Furthermore, we propose an explicit eigenvector-aligned SAM (X-SAM), which corrects the gradient via orthogonal decomposition along the top eigenvector, enabling more direct and efficient regularization of the Hessian's maximum eigenvalue. We prove X-SAM's convergence and superior generalization, with extensive experimental evaluations confirming both theoretical and practical advantages.
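
The geometric quantities named in the abstract (the leading Hessian eigenvector, the angle between it and the gradient, and the orthogonal decomposition of the gradient along it) can be illustrated with a minimal NumPy sketch. This is not the paper's update rule, which the abstract does not specify. It assumes a toy quadratic loss with an explicit Hessian, and the function names `top_eigenvector` and `angle_and_decomposition` are hypothetical.

```python
import numpy as np


def top_eigenvector(hessian, iters=200, seed=0):
    """Power iteration: approximates the eigenvector of the dominant eigenvalue."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(hessian.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = hessian @ v
        v /= np.linalg.norm(v)
    return v


def angle_and_decomposition(grad, v):
    """Angle (degrees) between grad and v, plus grad split along and across v."""
    cos = grad @ v / (np.linalg.norm(grad) * np.linalg.norm(v))
    angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    parallel = (grad @ v) / (v @ v) * v   # component along the top eigenvector
    orthogonal = grad - parallel          # remainder, orthogonal to v
    return angle, parallel, orthogonal


# Toy quadratic loss L(w) = 0.5 * w^T H w (an assumption for illustration).
H = np.diag([10.0, 1.0])   # sharp along axis 0, flat along axis 1
w = np.array([1.0, 1.0])
grad = H @ w               # gradient of the quadratic at w

v = top_eigenvector(H)
angle, g_par, g_orth = angle_and_decomposition(grad, v)
print("angle (deg):", angle)
print("parallel part:", g_par)
print("orthogonal part:", g_orth)
```

The sign of an eigenvector is arbitrary, so the reported angle depends on the orientation that power iteration happens to return; the parallel and orthogonal components do not. For a real network the Hessian is never formed explicitly, and the product `hessian @ v` would typically be replaced by a Hessian-vector product computed with automatic differentiation.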

