Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models
This work addresses the adversarial robustness of deep learning models on ImageNet and is relevant to both researchers and practitioners. Its contribution is incremental, since it builds on existing adversarial training methods and adds architectural tweaks.
The paper revisits adversarial training on ImageNet by comparing ViTs and ConvNeXts. It finds that minor architectural changes, such as replacing PatchStem with ConvStem, significantly increase robustness in the seen threat model and improve generalization to unseen attacks. ConvNeXt + ConvStem yields the most robust models, while ViT + ConvStem yields the best generalization.
While adversarial training has been extensively studied for ResNet architectures and low-resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture (most notably replacing PatchStem with ConvStem) and in the training scheme have a significant impact on the achieved robustness. These changes not only increase robustness in the seen $\ell_\infty$-threat model, but even more so improve generalization to unseen $\ell_1/\ell_2$-attacks. Our modified ConvNeXt, ConvNeXt + ConvStem, yields the most robust $\ell_\infty$-models across different ranges of model parameters and FLOPs, while our ViT + ConvStem yields the best generalization to unseen threat models.
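For reference, adversarial training in an $\ell_p$-threat model is commonly written as the min-max problem below. This is the standard textbook formulation, given here as a sketch and not taken from the abstract; the symbols $f_\theta$ (the classifier), $\mathcal{L}$ (the training loss), $\mathcal{D}$ (the data distribution), $\delta$ (the perturbation) and $\epsilon_p$ (the perturbation radius) are notation assumed for illustration.

$$\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \left[ \max_{\|\delta\|_p \le \epsilon_p} \mathcal{L}\big(f_\theta(x+\delta),\, y\big) \right]$$

In this notation, the seen threat model corresponds to training with $p=\infty$, and generalization to unseen threat models means evaluating the same trained $f_\theta$ against perturbations bounded in the $\ell_1$ or $\ell_2$ norm.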