Why Mixup Improves the Model Performance
This work addresses the need, among machine learning researchers and practitioners, for a theoretical understanding of a widely used data augmentation technique. The contribution is incremental, however, because it builds on existing empirical work.
The paper tackles the problem of over-fitting in machine learning models by investigating why the mixup regularization method improves performance, focusing on theoretical guarantees from statistical learning theory.
Machine learning techniques are used in a wide range of domains. However, machine learning models often suffer from over-fitting. Many data augmentation methods have been proposed to tackle this problem, and one of them is mixup. Mixup is a recently proposed regularization procedure that linearly interpolates a random pair of training examples. This regularization method works very well experimentally, but its theoretical guarantees have not been adequately discussed. In this study, we aim to discover why mixup works well from the viewpoint of statistical learning theory.
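To make the interpolation step concrete, the following is a minimal sketch of how a mixed example could be formed from a random pair of training examples. The abstract states only that the pair is linearly interpolated. Drawing the mixing weight `lam` from a Beta(alpha, alpha) distribution and applying the same weight to the label vectors follow the commonly used formulation of mixup and are assumptions here. The names `mixup_pair`, `mixup_sample`, and `alpha` are hypothetical and chosen for illustration.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Return the convex combination of two examples with weight lam."""
    # Interpolate the input features coordinate by coordinate.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    # Assumption: labels are vectors (e.g. one-hot) mixed with the same weight.
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


def mixup_sample(xs, ys, alpha=1.0, rng=random):
    """Pick a random pair of training examples and interpolate them."""
    i = rng.randrange(len(xs))
    j = rng.randrange(len(xs))
    # Assumption: the mixing weight is drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    return mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
```

Because `lam` lies in [0, 1], every mixed example is a point on the line segment between the two originals, so training on such samples encourages the model to behave approximately linearly between training points.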