CV | May 19, 2024

AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

Suorong Yang, Peijia Li, Xin Xiong, Furao Shen, Jian Zhao

arXiv:2405.11467v310.512 citations | h-index: 9 | Has Code | IEEE Transactions on Image Processing

Originality: Highly original

AI Analysis

The paper addresses a key limitation of data augmentation for deep learning practitioners by providing a tuning-free, adaptive approach to improving generalization. The contribution is incremental: it builds on existing DA methods and adds a novel adaptation mechanism.

The paper targets a specific problem: augmentation magnitudes that are fixed or random can become misaligned with the model's training status, which raises the risk of both underfitting and overfitting. It proposes AdaAugment, an adaptive method that uses reinforcement learning to adjust magnitudes dynamically, and it reports consistent gains over state-of-the-art DA methods across benchmarks.

Data augmentation (DA) is widely employed to improve the generalization performance of deep models. However, most existing DA methods apply augmentation operations with fixed or random magnitudes throughout the training process. While this fosters data diversity, it can also introduce uncontrolled variability in the augmented data, which may cause misalignment with the evolving training status of the target model. Both theoretical and empirical findings suggest that this misalignment increases the risks of both underfitting and overfitting. To address these limitations, we propose AdaAugment, an innovative and tuning-free adaptive augmentation method that leverages reinforcement learning to dynamically adjust augmentation magnitudes for individual training samples based on real-time feedback from the target network. Specifically, AdaAugment features a dual-model architecture consisting of a policy network and a target network, which are jointly optimized to effectively adapt augmentation magnitudes to the model's training progress. The policy network optimizes the variability within the augmented data, while the target network uses the adaptively augmented samples for training, so the two networks mutually reinforce each other. Extensive experiments across benchmark datasets and deep architectures demonstrate that AdaAugment consistently outperforms other state-of-the-art DA methods in effectiveness while maintaining remarkable efficiency. Code is available at https://github.com/Jackbrocp/AdaAugment.
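The feedback loop described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the learned policy network is replaced here by a hand-written rule, and the target network by a one-parameter linear model. The names `augment`, `update_magnitude`, `train`, and the `target_loss` and `step` parameters are all hypothetical. The sketch only shows the general shape of the idea, namely that each training sample carries its own magnitude, which is nudged up when the target model finds the augmented sample easy and down when it finds it hard.

```python
import random


def clamp(x, lo=0.0, hi=1.0):
    """Keep a magnitude inside [lo, hi]."""
    return max(lo, min(hi, x))


def augment(x, magnitude, rng):
    """Toy augmentation (assumption): additive noise scaled by magnitude."""
    return x + magnitude * rng.uniform(-1.0, 1.0)


def update_magnitude(magnitude, loss, target_loss=0.5, step=0.1):
    """Heuristic stand-in for the policy network (assumption).

    Raises the magnitude when the loss is below target_loss (easy sample)
    and lowers it when the loss is above target_loss (hard sample).
    """
    return clamp(magnitude + step * (target_loss - loss))


def train(data, epochs=20, lr=0.05, seed=0):
    """Fit y = w * x with SGD while adapting one magnitude per sample."""
    rng = random.Random(seed)
    w = 0.0
    magnitudes = [0.5] * len(data)  # one magnitude per training sample
    for _ in range(epochs):
        for i, (x, y) in enumerate(data):
            x_aug = augment(x, magnitudes[i], rng)
            err = w * x_aug - y
            loss = err * err
            w -= lr * 2.0 * err * x_aug  # target-model update
            # Feedback from the target model drives the magnitude update.
            magnitudes[i] = update_magnitude(magnitudes[i], loss)
    return w, magnitudes


if __name__ == "__main__":
    samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    weight, per_sample = train(samples)
    print(weight, per_sample)
```

In the paper, the magnitude decision is made by a policy network trained with reinforcement learning jointly with the target network; the fixed rule above is only a placeholder for that component.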

