Adversarial Mixing Policy for Relaxing Locally Linear Constraints in Mixup
This work addresses regularization issues in deep classification networks, particularly in low-resource settings, offering an incremental improvement over existing Mixup variants.
The paper tackles under-fitting in Mixup regularization caused by its strict linear constraints. It proposes the Adversarial Mixing Policy (AMP), which relaxes these constraints by adding adversarial perturbations to the mixing coefficients. On text classification benchmarks, AMP reduces the error rate relative to Mixup variants by up to 31.3%.
Mixup is a recent regularizer for deep classification networks. By training a neural network on convex combinations of pairs of examples and their labels, it imposes locally linear constraints on the model's input space. However, such strict linear constraints often lead to under-fitting, which degrades the effect of the regularization. Notably, this issue becomes more serious when resources are extremely limited. To address these issues, we propose the Adversarial Mixing Policy (AMP), organized in a min-max-rand formulation, to relax the locally linear constraints in Mixup. Specifically, AMP adds a small adversarial perturbation to the mixing coefficients rather than to the examples. Thus, slight non-linearity is injected between the synthetic examples and the synthetic labels. By training on these data, the deep networks are further regularized and thus achieve a lower predictive error rate. Experiments on five text classification benchmarks and five backbone models show empirically that our methods reduce the error rate over Mixup variants by a significant margin (up to 31.3%), especially in low-resource conditions (up to 17.5%).
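The contrast between standard Mixup and a perturbed mixing coefficient can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the toy loss, the step size `epsilon`, the finite-difference gradient estimate, and the choice to perturb only the input coefficient while the labels keep the sampled one are all hypothetical choices made for this example.

```python
import math
import random


def sample_lambda(alpha, rng=random):
    """'rand' step: draw a mixing coefficient from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mixup(x_i, x_j, y_i, y_j, lam):
    """Standard Mixup: one coefficient mixes both inputs and labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


def toy_loss(x, y):
    """Cross-entropy of softmax(x) against soft label y.

    Assumption: the 'model' is the identity map, so the logits are x itself.
    """
    m = max(x)
    log_z = m + math.log(sum(math.exp(v - m) for v in x))
    return -sum(t * (v - log_z) for t, v in zip(y, x))


def perturbed_mixup(x_i, x_j, y_i, y_j, lam, loss_fn, epsilon=0.05, h=1e-4):
    """Hypothetical 'max' step: shift the input coefficient by at most
    epsilon in the direction that increases the loss. The labels are still
    mixed with the original lam, so inputs and labels are no longer tied
    together by a single linear relation.
    """
    _, y = mixup(x_i, x_j, y_i, y_j, lam)

    def loss_at(coeff):
        x, _ = mixup(x_i, x_j, y_i, y_j, coeff)
        return loss_fn(x, y)

    # Central finite difference stands in for a backpropagated gradient.
    grad = (loss_at(lam + h) - loss_at(lam - h)) / (2.0 * h)
    sign = (grad > 0) - (grad < 0)
    lam_adv = min(1.0, max(0.0, lam + epsilon * sign))

    x, _ = mixup(x_i, x_j, y_i, y_j, lam_adv)
    return x, y, lam_adv
```

In a training loop, the returned pair `(x, y)` would replace the standard Mixup pair in the usual loss minimization (the "min" step). Keeping `epsilon` small matches the abstract's description of a small perturbation and slight non-linearity.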