LGCLDec 16, 2021

Sharpness-Aware Minimization with Dynamic Reweighting

arXiv:2112.08772v4292 citations
Originality Incremental advance
AI Analysis

This work addresses an efficiency bottleneck in adversarial training for deep neural networks, offering a more practical method for researchers and practitioners in machine learning, though it is incremental as it builds directly on SAM.

The paper tackled the computational inefficiency of per-instance adversarial weight perturbations in sharpness-aware minimization (SAM) by proposing delta-SAM, which uses dynamic reweighting of per-batch perturbations to approximate per-instance ones, achieving improved generalization on natural language understanding tasks.

Deep neural networks are often overparameterized and may not easily achieve model generalization. Adversarial training has shown effectiveness in improving generalization by regularizing the change of loss on top of adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm conducts adversarial weight perturbation, encouraging the model to converge to a flat minima. SAM finds a common adversarial weight perturbation per-batch. Although per-instance adversarial weight perturbations are stronger adversaries and can potentially lead to better generalization performance, their computational cost is very high and thus it is impossible to use per-instance perturbations efficiently in SAM. In this paper, we tackle this efficiency bottleneck and propose sharpness-aware minimization with dynamic reweighting (delta-SAM). Our theoretical analysis motivates that it is possible to approach the stronger, per-instance adversarial weight perturbations using reweighted per-batch weight perturbations. delta-SAM dynamically reweights perturbation within each batch according to the theoretically principled weighting factors, serving as a good approximation to per-instance perturbation. Experiments on various natural language understanding tasks demonstrate the effectiveness of delta-SAM.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes