LG · ML · Jan 27, 2019

Augment your batch: better training with larger batches

arXiv:1901.09335v1 · 79 citations
Originality: Incremental advance
AI Analysis

This method addresses scaling issues in deep neural network training for researchers and practitioners, offering a simple incremental improvement.

The paper tackles poor generalization in large-batch SGD training by proposing batch augmentation: each sample is replicated within a batch, with a different data augmentation applied to each copy. This reduces the number of SGD updates needed to match state-of-the-art accuracy.

Large-batch SGD is important for scaling training of deep neural networks. However, without fine-tuning hyperparameter schedules, the generalization of the model may be hampered. We propose to use batch augmentation: replicating instances of samples within the same batch with different data augmentations. Batch augmentation acts as a regularizer and an accelerator, increasing both generalization and performance scaling. We analyze the effect of batch augmentation on gradient variance and show that it empirically improves convergence for a wide variety of deep neural networks and datasets. Our results show that batch augmentation reduces the number of necessary SGD updates to achieve the same accuracy as the state-of-the-art. Overall, this simple yet effective method enables faster training and better generalization by allowing more computational resources to be used concurrently.
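The replication step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `batch_augment` and `random_flip_and_shift`, the parameter `m` (number of augmented copies per sample), and the specific augmentation (random horizontal flip plus a small circular shift) are all hypothetical choices made for the example.

```python
import numpy as np


def random_flip_and_shift(image, rng, max_shift=2):
    """Hypothetical augmentation: random horizontal flip, then a small circular shift."""
    out = image[:, ::-1] if rng.random() < 0.5 else image
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(out, shift=(int(dy), int(dx)), axis=(0, 1))


def batch_augment(images, labels, m, rng):
    """Build a batch of size m * B in which every sample appears m times,
    each copy drawn with an independent random augmentation."""
    rep_images = np.repeat(images, m, axis=0)  # sample i occupies rows i*m .. i*m + m - 1
    rep_labels = np.repeat(labels, m, axis=0)  # labels are copied unchanged
    aug_images = np.stack([random_flip_and_shift(img, rng) for img in rep_images])
    return aug_images, rep_labels


rng = np.random.default_rng(0)
images = rng.random((4, 8, 8))  # B = 4 toy grayscale images of size 8 x 8
labels = np.array([0, 1, 2, 3])
aug_images, aug_labels = batch_augment(images, labels, m=3, rng=rng)
```

With `m=3`, the toy batch of 4 images becomes a batch of 12, which would then be fed to a single SGD step in place of the original batch. The effective batch size grows by a factor of `m` while the number of distinct samples per step stays the same.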

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
