Imbalanced Adversarial Training with Reweighting
This addresses a gap in adversarial robustness for imbalanced data, which is common in real-world applications, but the approach is incremental as it builds on existing adversarial training methods.
The paper tackles the problem of adversarial training on imbalanced datasets, revealing that adversarially trained models perform worse on under-represented classes and that traditional reweighting strategies are ineffective, and proposes Separable Reweighted Adversarial Training (SRAT) to improve performance by learning more separable features.
Adversarial training has been empirically proven to be one of the most effective and reliable defense methods against adversarial attacks. However, almost all existing studies about adversarial training are focused on balanced datasets, where each class has an equal amount of training examples. Research on adversarial training with imbalanced training datasets is rather limited. As the initial effort to investigate this problem, we reveal the facts that adversarially trained models present two distinguished behaviors from naturally trained models in imbalanced datasets: (1) Compared to natural training, adversarially trained models can suffer much worse performance on under-represented classes, when the training dataset is extremely imbalanced. (2) Traditional reweighting strategies may lose efficacy to deal with the imbalance issue for adversarial training. For example, upweighting the under-represented classes will drastically hurt the model's performance on well-represented classes, and as a result, finding an optimal reweighting value can be tremendously challenging. In this paper, to further understand our observations, we theoretically show that the poor data separability is one key reason causing this strong tension between under-represented and well-represented classes. Motivated by this finding, we propose Separable Reweighted Adversarial Training (SRAT) to facilitate adversarial training under imbalanced scenarios, by learning more separable features for different classes. Extensive experiments on various datasets verify the effectiveness of the proposed framework.