LG, CV | Dec 27, 2022

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

CMU, DeepMind
arXiv:2212.13381v5 | 12 citations | h-index: 57
Originality: Incremental advance
AI Analysis

This work addresses the need for better generalization in deep learning models across multiple domains, but it is incremental as it builds upon the existing Mixup method.

The paper improves Mixup, a data augmentation technique, by analyzing it from a directional derivative perspective and proposing a theoretically justified enhanced version. The enhanced version outperforms vanilla Mixup across several domains, for example by 0.8% in ImageNet top-1 accuracy.

Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. Based on this new insight, we propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
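For readers unfamiliar with the baseline, the interpolation step described in the abstract can be sketched as follows. This is a minimal, standard-library illustration of vanilla Mixup only, not the improved method the paper proposes. The function name `mixup_batch`, its parameters, and the list-of-lists data layout are hypothetical choices made for this example; sampling the mixing coefficient from a Beta(alpha, alpha) distribution follows the original Mixup formulation.

```python
import random


def mixup_batch(inputs, labels, alpha=1.0, rng=None):
    """Vanilla Mixup on one batch (illustrative sketch, not the paper's code).

    inputs: list of equal-length feature vectors (lists of floats)
    labels: list of equal-length one-hot or soft label vectors
    alpha:  Beta(alpha, alpha) concentration for the mixing coefficient
    rng:    optional random.Random instance, for reproducibility
    """
    if rng is None:
        rng = random.Random()

    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)

    # Pair each sample with a randomly chosen partner from the same batch.
    perm = list(range(len(inputs)))
    rng.shuffle(perm)

    def interpolate(a, b):
        # Convex combination: lam * a + (1 - lam) * b, element by element.
        return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]

    mixed_inputs = [interpolate(inputs[i], inputs[j]) for i, j in enumerate(perm)]
    mixed_labels = [interpolate(labels[i], labels[j]) for i, j in enumerate(perm)]
    return mixed_inputs, mixed_labels, lam


if __name__ == "__main__":
    x = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    y = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    mixed_x, mixed_y, lam = mixup_batch(x, y, alpha=1.0, rng=random.Random(0))
    print("lambda:", lam)
    print("mixed inputs:", mixed_x)
    print("mixed labels:", mixed_y)
```

Because inputs and labels are mixed with the same coefficient, each mixed label remains a valid distribution whenever the original labels are. In practice the same operation is usually applied to tensors inside the training loop rather than to Python lists.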

Code Implementations: 1 repo