LG ML | Jun 17, 2019

MixUp as Directional Adversarial Training

Guillaume P. Archambault, Yongyi Mao, Hongyu Guo, Richong Zhang

arXiv:1906.06875v111.123 citations

Originality: Incremental advance

AI Analysis

This work provides a theoretical explanation for MixUp's success and introduces a more general family of schemes, offering incremental insights for machine learning practitioners.

The authors explain MixUp's effectiveness by showing it is equivalent to a special case of directional adversarial training (DAT), a new class of adversarial training schemes that perturbs examples toward others while keeping original labels, and they introduce Untied MixUp schemes that empirically improve upon MixUp.

In this work, we explain the working mechanism of MixUp in terms of adversarial training. We introduce a new class of adversarial training schemes, which we refer to as directional adversarial training, or DAT. In a nutshell, a DAT scheme perturbs a training example in the direction of another example but keeps its original label as the training target. We prove that MixUp is equivalent to a special subclass of DAT, in that it has the same expected loss function and corresponds to the same optimization problem asymptotically. This understanding not only serves to explain the effectiveness of MixUp, but also reveals a more general family of MixUp schemes, which we call Untied MixUp. We prove that the family of Untied MixUp schemes is equivalent to the entire class of DAT schemes. We establish empirically the existence of Untied MixUp schemes which improve upon MixUp.
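To make the contrast in the abstract concrete, here is a minimal NumPy sketch of how a MixUp batch and a DAT batch might be built from the same inputs. It is an illustration under stated assumptions, not the authors' implementation: the function names (`mix_inputs`, `mixup_batch`, `dat_batch`) are hypothetical, and drawing the mixing coefficient from a Beta(1, 1) distribution is the usual MixUp choice assumed here for both schemes. The abstract does not specify DAT's mixing distribution.

```python
import numpy as np


def mix_inputs(x, perm, lam):
    """Move each example toward a partner: lam * x_i + (1 - lam) * x_j."""
    # Reshape lam so it broadcasts over all non-batch dimensions of x.
    lam = lam.reshape(-1, *([1] * (x.ndim - 1)))
    return lam * x + (1.0 - lam) * x[perm]


def mixup_batch(x, y_onehot, lam, perm):
    """MixUp: interpolate the inputs AND the labels with the same lam."""
    x_mix = mix_inputs(x, perm, lam)
    l = lam.reshape(-1, 1)
    y_mix = l * y_onehot + (1.0 - l) * y_onehot[perm]
    return x_mix, y_mix


def dat_batch(x, y_onehot, lam, perm):
    """DAT-style: perturb the input toward the partner, keep the original label."""
    return mix_inputs(x, perm, lam), y_onehot.copy()


# Toy data: 4 examples with 3 features and 2 classes (all values hypothetical).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
y = np.eye(2)[[0, 1, 0, 1]]

# Assumed mixing policy: one Beta(1, 1) coefficient per example.
lam = rng.beta(1.0, 1.0, size=4)
perm = rng.permutation(4)  # partner index j for each example i

x_mu, y_mu = mixup_batch(x, y, lam, perm)
x_dat, y_dat = dat_batch(x, y, lam, perm)
```

With the same `lam` and `perm`, both schemes produce identical perturbed inputs. They differ only in the training target: MixUp returns soft labels, while the DAT-style batch returns the original one-hot labels unchanged.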
