LG, Aug 7, 2023

G-Mix: A Generalized Mixup Learning Framework Towards Flat Minima

arXiv:2308.03236v21 citations, h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses over-parameterization and limited data issues in deep learning, offering an incremental improvement for training more robust models.

The paper tackles the problem of improving generalization in deep neural networks by proposing G-Mix, a framework combining Mixup and Sharpness-Aware Minimization to achieve flatter minima, with experimental results showing state-of-the-art performance across multiple datasets and models.

Deep neural networks (DNNs) have demonstrated promising results in various complex tasks. However, current DNNs encounter challenges with over-parameterization, especially when limited training data is available. To enhance the generalization capability of DNNs, the Mixup technique has gained popularity. Nevertheless, it still produces suboptimal outcomes. Inspired by the successful Sharpness-Aware Minimization (SAM) approach, which establishes a connection between the sharpness of the training loss landscape and model generalization, we propose a new learning framework called Generalized-Mixup (G-Mix), which combines the strengths of Mixup and SAM for training DNN models. Our theoretical analysis demonstrates how the G-Mix framework enhances generalization. Additionally, to further optimize DNN performance with the G-Mix framework, we introduce two novel algorithms: Binary G-Mix (BG-Mix) and Decomposed G-Mix (DG-Mix). These algorithms partition the training data into two subsets based on the sharpness-sensitivity of each example to address the issue of "manifold intrusion" in Mixup. Both theoretical explanations and experimental results reveal that the proposed BG-Mix and DG-Mix algorithms further enhance model generalization across multiple datasets and models, achieving state-of-the-art performance.
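
The abstract does not spell out the G-Mix update rule, so the following is only a minimal sketch of the general idea, assuming the simplest possible combination: a standard Mixup batch is built first, and a standard SAM step (ascend to a nearby perturbed weight, then descend with the gradient taken there) is applied to that mixed batch. It is not the paper's BG-Mix or DG-Mix procedure, and the sharpness-sensitivity partitioning is not shown. The function names (`mixup`, `loss_and_grad`, `sam_step`, `train`), the linear regression model, and the hyperparameters `lr`, `rho`, and `alpha` are all illustrative assumptions.

```python
import math
import random


def mixup(xs, ys, alpha, rng):
    """Standard Mixup: convex-combine each example with a randomly chosen partner."""
    lam = rng.betavariate(alpha, alpha)      # mixing coefficient in [0, 1]
    perm = list(range(len(xs)))
    rng.shuffle(perm)                        # perm[i] is the partner of example i
    mixed_x = [[lam * a + (1 - lam) * b for a, b in zip(xs[i], xs[j])]
               for i, j in enumerate(perm)]
    mixed_y = [lam * ys[i] + (1 - lam) * ys[j] for i, j in enumerate(perm)]
    return mixed_x, mixed_y


def loss_and_grad(w, xs, ys):
    """Mean squared error of a toy linear model y = w . x, and its gradient in w."""
    n = len(xs)
    loss = 0.0
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = sum(wk * xk for wk, xk in zip(w, x)) - y
        loss += err * err / n
        for k, xk in enumerate(x):
            grad[k] += 2.0 * err * xk / n
    return loss, grad


def sam_step(w, xs, ys, lr, rho):
    """One SAM update: perturb w along the normalised gradient, then descend
    using the gradient evaluated at the perturbed point."""
    _, g = loss_and_grad(w, xs, ys)
    norm = math.sqrt(sum(gk * gk for gk in g)) + 1e-12   # avoid division by zero
    w_adv = [wk + rho * gk / norm for wk, gk in zip(w, g)]
    _, g_adv = loss_and_grad(w_adv, xs, ys)
    return [wk - lr * gk for wk, gk in zip(w, g_adv)]


def train(xs, ys, steps=500, lr=0.1, rho=0.05, alpha=1.0, seed=0):
    """Hypothetical training loop: Mixup builds each batch, SAM updates on it."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        mixed_x, mixed_y = mixup(xs, ys, alpha, rng)
        w = sam_step(w, mixed_x, mixed_y, lr, rho)
    return w


# Toy data generated by y = 2*x0 - 1*x1 (an assumption made for illustration).
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
ys = [2.0 * a - 1.0 * b for a, b in xs]
print(train(xs, ys))
```

On this toy problem the learned weights should end up close to the generating coefficients, although a fixed `rho` keeps the iterate hovering slightly around the minimiser rather than settling on it exactly. Setting `rho=0.0` reduces `sam_step` to an ordinary gradient step, which makes it easy to compare plain Mixup training against the Mixup plus SAM combination.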

