LG · AI · CV · IT · NE · May 5, 2025

Sharpness-Aware Minimization with Z-Score Gradient Filtering

arXiv:2505.02369v5 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses generalization challenges in deep learning optimization, offering an incremental improvement over Sharpness-Aware Minimization for researchers and practitioners in computer vision.

The paper tackles generalization in deep neural networks by proposing Z-Score Filtered Sharpness-Aware Minimization, which filters each layer's gradients so that the ascent step uses only the most significant components. The authors report improved test accuracy on CIFAR-10, CIFAR-100, and Tiny-ImageNet compared with existing methods.

Deep neural networks achieve high performance across many domains but can still face challenges in generalization when optimization is influenced by small or noisy gradient components. Sharpness-Aware Minimization improves generalization by perturbing parameters toward directions of high curvature, but it uses the entire gradient vector, which means that small or noisy components may affect the ascent step and cause the optimizer to miss optimal solutions. We propose Z-Score Filtered Sharpness-Aware Minimization, which applies Z-score based filtering to gradients in each layer. Instead of using all gradient components, a mask is constructed to retain only the top percentile with the largest absolute Z-scores. The percentile threshold $Q_p$ determines how many components are kept, so that the ascent step focuses on directions that stand out most compared to the average of the layer. This selective perturbation refines the search toward flatter minima while reducing the influence of less significant gradients. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet with architectures including ResNet, VGG, and Vision Transformers show that the proposed method consistently improves test accuracy compared to Sharpness-Aware Minimization and its variants. The code repository is available at: https://github.com/YUNBLAK/Sharpness-Aware-Minimization-with-Z-Score-Gradient-Filtering
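The per-layer filtering described in the abstract can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation (see their repository for that). Several details are assumptions made for the example: the names `zscore_mask`, `filtered_ascent_step`, and `keep_percent` are hypothetical, `keep_percent` is read as the share of components kept per layer (the abstract does not spell out how $Q_p$ maps to that share), ties at the threshold are all kept, and the perturbation is rescaled to radius `rho` as in standard Sharpness-Aware Minimization.

```python
import math


def zscore_mask(grad, keep_percent, eps=1e-12):
    """Return a 0/1 mask over one layer's gradient components.

    Keeps the components whose absolute Z-score (relative to the layer's
    own mean and standard deviation) is among the largest `keep_percent`
    percent. `keep_percent` is a hypothetical parameter name.
    """
    n = len(grad)
    mean = sum(grad) / n
    std = math.sqrt(sum((g - mean) ** 2 for g in grad) / n)
    abs_z = [abs(g - mean) / (std + eps) for g in grad]

    # Number of components to keep (at least one).
    k = max(1, math.ceil(n * keep_percent / 100.0))
    # The k-th largest |z| acts as the threshold; ties are all kept.
    threshold = sorted(abs_z, reverse=True)[k - 1]
    return [1.0 if z >= threshold else 0.0 for z in abs_z]


def filtered_ascent_step(layer_grads, rho, keep_percent, eps=1e-12):
    """Build a SAM-style ascent perturbation from filtered gradients.

    `layer_grads` is a list of per-layer gradient lists. Each layer is
    masked independently, then the masked gradient is rescaled so the
    whole perturbation has L2 norm (approximately) `rho`. This
    normalization is assumed from standard SAM, not taken from the paper.
    """
    masked = [
        [g * m for g, m in zip(grad, zscore_mask(grad, keep_percent))]
        for grad in layer_grads
    ]
    norm = math.sqrt(sum(g * g for layer in masked for g in layer))
    scale = rho / (norm + eps)
    return [[scale * g for g in layer] for layer in masked]


# Toy layer: two components stand out from the rest.
toy_grad = [0.1, -0.1, 0.05, 5.0, -4.0, 0.0, 0.02, -0.03, 0.01, 0.2]
toy_mask = zscore_mask(toy_grad, keep_percent=20)
toy_eps = filtered_ascent_step([toy_grad], rho=0.05, keep_percent=20)
```

With `keep_percent=20` on this ten-component toy layer, only the two outlying components (5.0 and -4.0) survive the mask, so the perturbation moves the parameters along those two directions alone. In a full training loop, the second (descent) gradient would then be computed at the perturbed parameters, as in ordinary Sharpness-Aware Minimization.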

