VASSO: Variance Suppression for Sharpness-Aware Minimization
This work addresses a specific bottleneck in SAM that limits generalization in deep neural networks, and represents an incremental advance.
The paper tackles the problem of 'over-friendly adversaries' in Sharpness-Aware Minimization (SAM), which limits generalization, by proposing VASSO, a variance suppression method that stabilizes adversaries. The result is improved generalization validated on vision and language tasks, with a desirable generalization-computation tradeoff when integrated with an efficient SAM variant.
Sharpness-aware minimization (SAM) has well-documented merits in enhancing the generalization of deep neural network models. SAM accounts for sharpness in the loss function geometry, where neighborhoods of 'flat minima' heighten generalization ability, and seeks 'flat valleys' by minimizing the maximum loss provoked by an adversarial perturbation within the neighborhood. Although accounting for the sharpness of the loss function is critical, in practice SAM suffers from 'over-friendly adversaries,' which can curtail the utmost level of generalization. To avoid such 'friendliness,' the present contribution fosters stabilization of adversaries through variance suppression (VASSO). VASSO offers a general approach to provably stabilize adversaries. In particular, when VASSO is integrated with SAM, improved generalizability is numerically validated on extensive vision and language tasks. When applied on top of a computationally efficient SAM variant, VASSO offers a desirable generalization-computation tradeoff.
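The SAM update described above, which minimizes the loss at an adversarially perturbed point inside a small neighborhood, can be sketched as follows. This is a minimal NumPy illustration and not the authors' implementation: the function names, the toy quadratic loss, and the hyperparameters `lr`, `rho`, and `theta` are all assumptions made here. In particular, the abstract does not spell out how variance suppression is carried out; the sketch assumes one plausible form, in which the adversary direction is an exponential moving average of past gradients rather than the current gradient alone.

```python
import numpy as np

def perturbation(direction, rho):
    """Scale `direction` to length rho (the neighborhood radius)."""
    return rho * direction / (np.linalg.norm(direction) + 1e-12)

def sam_step(w, grad_fn, lr=0.1, rho=0.05, d_prev=None, theta=None):
    """One SAM-style update (hypothetical helper, for illustration only).

    grad_fn(w) returns a gradient, possibly a noisy minibatch estimate.
    theta=None gives plain SAM: the adversary follows the current gradient.
    With theta in (0, 1], the adversary follows a moving average of
    gradients. This averaging rule is an assumption of this sketch.
    """
    g = grad_fn(w)
    if theta is None or d_prev is None:
        d = g
    else:
        d = (1.0 - theta) * d_prev + theta * g
    w_adv = w + perturbation(d, rho)       # first-order approximation of the worst case
    return w - lr * grad_fn(w_adv), d      # descend using the gradient at the perturbed point

# Toy problem (assumed): a quadratic loss with one sharp and one flat direction.
A = np.diag([1.0, 10.0])

def loss(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

def run(theta=None, steps=50):
    """Run `steps` updates from a fixed start and return the final loss."""
    w, d = np.array([1.0, 1.0]), None
    for _ in range(steps):
        w, d = sam_step(w, grad, d_prev=d, theta=theta)
    return loss(w)
```

Calling `run()` exercises the plain SAM direction, while `run(theta=0.4)` exercises the averaged one. On this deterministic toy problem the two behave similarly; any benefit of a smoothed adversary would be expected only when `grad_fn` returns noisy minibatch gradients.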