Further advantages of data augmentation on convolutional neural networks
This work addresses the hyperparameter tuning burden that explicit regularization places on deep learning practitioners, although the contribution is incremental because it builds on known benefits of data augmentation.
The paper systematically compares data augmentation with explicit regularization techniques, such as weight decay and dropout, in convolutional neural networks for image classification. It finds that data augmentation alone adapts more easily to different architectures and amounts of training data, without fine-tuning of hyperparameters.
Data augmentation is a popular technique widely used to enhance the training of convolutional neural networks. Although many of its benefits are well known to deep learning researchers and practitioners, its implicit regularization effects, as compared to popular explicit regularization techniques such as weight decay and dropout, remain largely unstudied. In fact, convolutional neural networks for image object classification are typically trained with both data augmentation and explicit regularization, on the assumption that the benefits of all techniques are complementary. In this paper, we systematically analyze these techniques through ablation studies of different network architectures trained with different amounts of training data. Our results unveil a largely ignored advantage of data augmentation: networks trained with just data augmentation adapt more easily to different architectures and amounts of training data, whereas weight decay and dropout require specific fine-tuning of their hyperparameters.
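The ablation design described above can be pictured as a grid over four factors: architecture, amount of training data, data augmentation on or off, and explicit regularization on or off. The following is a minimal sketch of how such a grid might be enumerated, not the paper's actual setup. The architecture names, the data fractions, and the decision to toggle weight decay and dropout together are all assumptions made for illustration.

```python
from itertools import product

# All values below are hypothetical placeholders, not the paper's settings.
ARCHITECTURES = ["net_a", "net_b"]   # stand-ins for the compared networks
DATA_FRACTIONS = [1.0, 0.5, 0.1]     # assumed fractions of the training set
AUGMENTATION = [False, True]         # data augmentation off / on
EXPLICIT_REG = [False, True]         # weight decay and dropout toggled together (assumption)


def ablation_grid():
    """Return one config dict per combination of the four factors."""
    return [
        {
            "architecture": arch,
            "data_fraction": frac,
            "augmentation": aug,
            "explicit_regularization": reg,
        }
        for arch, frac, aug, reg in product(
            ARCHITECTURES, DATA_FRACTIONS, AUGMENTATION, EXPLICIT_REG
        )
    ]


def augmentation_only(configs):
    """Select the runs that rely on data augmentation alone."""
    return [
        c for c in configs
        if c["augmentation"] and not c["explicit_regularization"]
    ]


if __name__ == "__main__":
    for config in augmentation_only(ablation_grid()):
        print(config)
```

In a design like this, the augmentation-only runs are the ones of interest for the adaptability claim, since they carry no weight decay or dropout hyperparameters that would need re-tuning when the architecture or the data fraction changes.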