Adversarial Training: A Survey
The paper provides a systematic review for researchers and practitioners working on adversarial robustness in machine learning, but, as a survey, its contribution is incremental.
This survey addresses the lack of a comprehensive overview of adversarial training (AT), a method that integrates adversarial examples into training to improve the robustness of deep neural networks against attacks. It reviews recent studies on AT implementation, techniques, challenges, and future directions.
Adversarial training (AT) refers to integrating adversarial examples -- inputs altered with imperceptible perturbations that can significantly impact model predictions -- into the training process. Recent studies have demonstrated the effectiveness of AT in improving the robustness of deep neural networks against diverse adversarial attacks. However, a comprehensive overview of these developments is still missing. This survey addresses this gap by reviewing a broad range of recent and representative studies. Specifically, we first describe the implementation procedures and practical applications of AT, followed by a comprehensive review of AT techniques from three perspectives: data enhancement, network design, and training configurations. Lastly, we discuss common challenges in AT and propose several promising directions for future research.
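The definition of AT above can be made concrete with a small sketch. The code below is not taken from the survey; it is an illustration under stated assumptions: a logistic-regression model stands in for a deep network, the single-step fast gradient sign method (FGSM) stands in for the attack that generates adversarial examples, and the two-cluster toy dataset and all function names (`fgsm_example`, `adversarial_train`, `robust_accuracy`, `make_toy_data`) are hypothetical. Each training step first perturbs the input within an L-infinity budget `eps` in the direction that increases the loss, then updates the model on the perturbed input.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    if z >= 0:  # numerically stable sigmoid
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fgsm_example(w, b, x, y, eps):
    """One-step attack: x + eps * sign(dLoss/dx).

    For the logistic loss, dLoss/dx = (p - y) * w, so no autodiff is needed.
    """
    p = predict(w, b, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def adversarial_train(data, eps=0.2, lr=0.1, epochs=20, seed=0):
    """SGD in which every update is computed on an adversarial example."""
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not reordered
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            x_adv = fgsm_example(w, b, x, y, eps)  # craft the perturbed input
            err = predict(w, b, x_adv) - y         # loss gradient w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x_adv)]
            b -= lr * err
    return w, b


def robust_accuracy(w, b, data, eps):
    """Fraction of examples still classified correctly after an FGSM attack."""
    correct = 0
    for x, y in data:
        x_adv = fgsm_example(w, b, x, y, eps)
        correct += int((predict(w, b, x_adv) >= 0.5) == (y == 1))
    return correct / len(data)


def make_toy_data(n_per_class=20, seed=0):
    """Hypothetical 2-D dataset: class 1 near (1.5, 1.5), class 0 near (-1.5, -1.5)."""
    rng = random.Random(seed)
    data = []
    for y, center in ((1, 1.5), (0, -1.5)):
        for _ in range(n_per_class):
            data.append(([center + rng.uniform(-0.3, 0.3) for _ in range(2)], y))
    return data


if __name__ == "__main__":
    toy = make_toy_data()
    w, b = adversarial_train(toy, eps=0.2)
    print("robust accuracy:", robust_accuracy(w, b, toy, eps=0.2))
```

Setting `eps=0.0` in `adversarial_train` reduces the loop to ordinary training, which makes the extra step that AT adds easy to see. Stronger multi-step attacks and the data, architecture, and configuration choices that the survey reviews would replace or extend `fgsm_example` and the update rule; this sketch does not attempt to represent them.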