LG ML Jan 27, 2019

Augment your batch: better training with larger batches

Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry

arXiv:1901.09335v1 20.579 citations

Originality: Incremental advance

AI Analysis

This method addresses scaling issues in deep neural network training for researchers and practitioners, offering a simple incremental improvement.

The paper tackles poor generalization in large-batch SGD training by proposing batch augmentation: each sample is replicated within a batch, and every copy receives a different data augmentation. This reduces the number of SGD updates needed to reach state-of-the-art accuracy.

Large-batch SGD is important for scaling training of deep neural networks. However, without fine-tuning hyperparameter schedules, the generalization of the model may be hampered. We propose to use batch augmentation: replicating instances of samples within the same batch with different data augmentations. Batch augmentation acts as a regularizer and an accelerator, increasing both generalization and performance scaling. We analyze the effect of batch augmentation on gradient variance and show that it empirically improves convergence for a wide variety of deep neural networks and datasets. Our results show that batch augmentation reduces the number of necessary SGD updates to achieve the same accuracy as the state-of-the-art. Overall, this simple yet effective method enables faster training and better generalization by allowing more computational resources to be used concurrently.
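The core idea in the abstract, replicating each sample within a batch under different augmentations, can be illustrated with a minimal sketch. This is not the authors' implementation. The names `batch_augment`, `random_flip_and_shift`, and the replication factor `m` are hypothetical, as is the choice of augmentation (a random horizontal flip followed by a small circular shift), and the code uses only the Python standard library on toy list-based images to stay self-contained.

```python
import random


def random_flip_and_shift(image, rng, max_shift=2):
    """Hypothetical augmentation: random horizontal flip, then a circular shift."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    h, w = len(image), len(image[0])
    # Circular shift: pixels that leave one edge re-enter on the opposite edge.
    return [[image[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]


def batch_augment(images, labels, m, rng):
    """Build a batch of size B*m holding m independently augmented copies per sample."""
    aug_images, aug_labels = [], []
    for image, label in zip(images, labels):
        for _ in range(m):
            aug_images.append(random_flip_and_shift(image, rng))
            aug_labels.append(label)  # every copy keeps the original label
    return aug_images, aug_labels


rng = random.Random(0)
# Toy batch (assumption): B = 4 grayscale images of size 8x8.
images = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(4)]
labels = [0, 1, 2, 3]
aug_images, aug_labels = batch_augment(images, labels, m=3, rng=rng)
```

With `m=3`, the batch grows from 4 to 12 samples while still covering only the 4 original instances, so a single SGD step would average gradients over several augmented views of each instance. In a real training pipeline the augmentation would typically be whatever transform the model already uses, applied independently to each copy.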
