LG · AI · Nov 18, 2022

SAMSON: Sharpness-Aware Minimization Scaled by Outlier Normalization for Improving DNN Generalization and Robustness

arXiv:2211.11561v21 citations · h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses the challenge of energy-efficient DNN accelerators being prone to performance degradation due to non-idealities, offering a hardware-agnostic solution that avoids the typical trade-off between performance and robustness.

The paper aims to make deep neural networks more robust to noisy hardware at inference time without requiring hardware-specific knowledge. It proposes SAMSON, an adaptive sharpness-aware training method that conditions the perturbation of each weight on both its magnitude and the range of the weight distribution, and reports better generalization and robustness than existing sharpness-aware training methods.

Energy-efficient deep neural network (DNN) accelerators are prone to non-idealities that degrade DNN performance at inference time. To mitigate such degradation, existing methods typically add perturbations to the DNN weights during training to simulate inference on noisy hardware. However, this often requires knowledge about the target hardware and leads to a trade-off between DNN performance and robustness, decreasing the former to increase the latter. In this work, we show that applying sharpness-aware training, by optimizing for both the loss value and loss sharpness, significantly improves robustness to noisy hardware at inference time without relying on any assumptions about the target hardware. In particular, we propose a new adaptive sharpness-aware method that conditions the worst-case perturbation of a given weight not only on its magnitude but also on the range of the weight distribution. This is achieved by performing sharpness-aware minimization scaled by outlier normalization (SAMSON). Our approach outperforms existing sharpness-aware training methods both in terms of model generalization performance in noiseless regimes and robustness in noisy settings, as measured on several architectures and datasets.
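The idea of a worst-case perturbation that depends on both weight magnitude and weight range can be illustrated with a small NumPy sketch. This is not the paper's exact formulation, which the abstract does not give. It assumes an adaptive sharpness-aware update in which each weight's perturbation is scaled by |w| divided by the largest |w| in the tensor, with that maximum standing in for the "range of the weight distribution". The function names, the toy quadratic loss, and the values of `rho` and `lr` are all hypothetical.

```python
import numpy as np

def scaled_perturbation(w, grad, rho=0.05, eps=1e-12):
    """Illustrative worst-case perturbation scaled per weight.

    ASSUMPTION: the scale |w| / max|w| is a stand-in for conditioning on
    weight magnitude and on the range of the weight distribution; the
    paper's actual normalization may differ.
    """
    scale = np.abs(w) / (np.max(np.abs(w)) + eps)
    scaled_grad = scale * grad
    # Ascent direction of norm rho in the scaled coordinates, mapped back
    # to weight space by multiplying with the scale once more.
    return rho * scale * scaled_grad / (np.linalg.norm(scaled_grad) + eps)

def sharpness_aware_step(w, grad_fn, lr=0.1, rho=0.05):
    """One two-pass update: perturb the weights, then descend using the
    gradient evaluated at the perturbed point."""
    g = grad_fn(w)
    e = scaled_perturbation(w, g, rho)
    return w - lr * grad_fn(w + e)

# Toy quadratic loss with different curvature per weight (hypothetical).
curv = np.array([1.0, 2.0, 3.0])

def loss_fn(w):
    return 0.5 * np.sum(curv * w ** 2)

def grad_fn(w):
    return curv * w

w = np.array([1.0, -2.0, 0.5])
initial_loss = loss_fn(w)
for _ in range(50):
    w = sharpness_aware_step(w, grad_fn)
final_loss = loss_fn(w)
```

Under this scaling, a weight that is exactly zero receives no perturbation, and the weight with the largest magnitude receives the largest share of the perturbation budget `rho`. In a real training loop the two gradient evaluations would be two forward and backward passes over the same mini-batch.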

