Learning Sample Reweighting for Accuracy and Adversarial Robustness
This work addresses the problem of balancing accuracy and robustness in neural network classifiers under adversarial attack, and represents an incremental advance in adversarial training techniques.
The paper tackles the trade-off between standard accuracy and robust accuracy in neural networks by proposing an adversarial training framework that reweights training samples based on a class-conditioned margin, yielding consistent improvements in both clean and robust accuracy over state-of-the-art methods.
There has been great interest in enhancing the robustness of neural network classifiers to defend against adversarial perturbations through adversarial training, while balancing the trade-off between robust accuracy and standard accuracy. We propose a novel adversarial training framework that learns to reweight the loss associated with individual training samples based on a notion of class-conditioned margin, with the goal of improving robust generalization. We formulate weighted adversarial training as a bilevel optimization problem with the upper-level problem corresponding to learning a robust classifier, and the lower-level problem corresponding to learning a parametric function that maps from a sample's multi-class margin to an importance weight. Extensive experiments demonstrate that our approach consistently improves both clean and robust accuracy compared to related methods and state-of-the-art baselines.
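To make the idea of margin-based reweighting concrete, the following is a minimal sketch and not the authors' implementation. It assumes the multi-class margin is the true-class logit minus the largest other-class logit, and it uses a hypothetical two-parameter sigmoid (`slope`, `bias`) as the parametric map from margin to importance weight. In the bilevel formulation these parameters would be learned rather than fixed, and the logits would come from adversarially perturbed inputs; both steps are omitted here. All function names are illustrative.

```python
import math


def multiclass_margin(logits, label):
    """Assumed margin: true-class logit minus the largest other-class logit."""
    best_other = max(z for k, z in enumerate(logits) if k != label)
    return logits[label] - best_other


def importance_weight(margin, slope=1.0, bias=0.0):
    """Hypothetical parametric map from margin to a weight in (0, 1).

    With slope > 0 the weight decreases as the margin grows, so samples
    near or across the decision boundary receive more emphasis.
    """
    return 1.0 / (1.0 + math.exp(slope * margin + bias))


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def weighted_loss(batch_logits, labels, slope=1.0, bias=0.0):
    """Weighted average of per-sample losses, with weights normalized to sum to 1."""
    weights = [
        importance_weight(multiclass_margin(z, y), slope, bias)
        for z, y in zip(batch_logits, labels)
    ]
    losses = [cross_entropy(z, y) for z, y in zip(batch_logits, labels)]
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, losses)) / total


if __name__ == "__main__":
    # Stand-in logits; in adversarial training these would be computed
    # on perturbed inputs.
    batch_logits = [[2.0, 0.5, -1.0], [0.0, 3.0, 1.0]]
    labels = [0, 0]  # the second sample is misclassified (negative margin)
    print([multiclass_margin(z, y) for z, y in zip(batch_logits, labels)])
    print(weighted_loss(batch_logits, labels))
```

Setting `slope=0` makes every weight equal, so the weighted loss reduces to the ordinary mean loss; this is a convenient sanity check for any learned weighting function of this form.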