LG, CR, ML · Feb 14, 2020

Adversarial Distributional Training for Robust Deep Learning

arXiv:2002.05999v2 · 150 citations
AI Analysis

This paper addresses the limited robustness of adversarially trained deep learning models against unseen attacks, and offers a new approach to strengthening adversarial defense.

The paper tackles the problem of unreliable robustness in adversarial training by introducing adversarial distributional training (ADT), a framework that learns adversarial distributions to characterize potential adversarial examples, resulting in improved robustness validated on benchmarks.

Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples. However, most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks. Besides, a single attack algorithm could be insufficient to explore the space of perturbations. In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models. ADT is formulated as a minimax optimization problem, where the inner maximization aims to learn an adversarial distribution to characterize the potential adversarial examples around a natural one under an entropic regularizer, and the outer minimization aims to train robust models by minimizing the expected loss over the worst-case adversarial distributions. Through a theoretical analysis, we develop a general algorithm for solving ADT, and present three approaches for parameterizing the adversarial distributions, ranging from the typical Gaussian distributions to the flexible implicit ones. Empirical results on several benchmarks validate the effectiveness of ADT compared with the state-of-the-art AT methods.
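The inner maximization described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it only shows how the entropy-regularized objective (expected loss plus a weighted entropy term) might be estimated by Monte Carlo sampling for a Gaussian-parameterized adversarial distribution. Several choices here are assumptions not stated in the abstract: the L-infinity ball of radius `eps`, the tanh squashing used to keep samples inside that ball, the toy logistic-regression model, and every function and parameter name (`sample_perturbations`, `entropy_estimate`, `inner_objective`, `lam`).

```python
import numpy as np

def sample_perturbations(mu, log_sigma, eps, n_samples, rng):
    """Draw delta = eps * tanh(mu + sigma * r) with r ~ N(0, I).

    The tanh squashing (an assumption of this sketch) keeps every
    sample inside the L-infinity ball of radius eps.
    """
    r = rng.standard_normal((n_samples, mu.shape[0]))
    u = mu + np.exp(log_sigma) * r
    return eps * np.tanh(u), u

def entropy_estimate(u, log_sigma, eps):
    """Entropy of delta by change of variables: H(u) + E[sum_j log |d delta_j / d u_j|]."""
    d = log_sigma.shape[0]
    gaussian_entropy = np.sum(log_sigma) + 0.5 * d * np.log(2.0 * np.pi * np.e)
    # Derivative of eps * tanh(u) is eps * (1 - tanh(u)^2); small constant avoids log(0).
    log_jacobian = np.sum(np.log(eps * (1.0 - np.tanh(u) ** 2) + 1e-12), axis=1)
    return gaussian_entropy + np.mean(log_jacobian)

def logistic_loss(w, b, x, y):
    """Binary cross-entropy of a toy linear classifier, with label y in {0, 1}."""
    z = x @ w + b
    return np.logaddexp(0.0, z) - y * z

def inner_objective(w, b, x, y, mu, log_sigma, eps, lam, n_samples, rng):
    """Monte Carlo estimate of E_p[loss(x + delta, y)] + lam * H(p)."""
    delta, u = sample_perturbations(mu, log_sigma, eps, n_samples, rng)
    expected_loss = np.mean(logistic_loss(w, b, x + delta, y))
    return expected_loss + lam * entropy_estimate(u, log_sigma, eps), delta

# Toy usage: one 4-dimensional input and a randomly initialized linear model.
rng = np.random.default_rng(0)
d = 4
w, b = rng.standard_normal(d), 0.0
x, y = rng.standard_normal(d), 1.0
mu, log_sigma = np.zeros(d), np.zeros(d)
value, delta = inner_objective(
    w, b, x, y, mu, log_sigma, eps=0.1, lam=0.01, n_samples=256, rng=rng
)
```

In a full training loop, `mu` and `log_sigma` would be updated by gradient ascent on this estimate for each input, and the model parameters would then be updated by gradient descent on the expected loss under the resulting distribution. An automatic differentiation framework would be the natural tool for both steps and is left out of this sketch.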

Code Implementations

1 repo

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
