LG | ML | Jun 27, 2020

Stochastic Batch Augmentation with An Effective Distilled Dynamic Soft Label Regularizer

arXiv:2006.15284v1 | 16 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific issue in data augmentation for deep learning, offering incremental improvements in training efficiency and model robustness for computer vision applications.

The authors observe that deep neural network training typically ignores the distributional relationship between original and augmented data, which affects generalization and adversarial robustness. They propose Stochastic Batch Augmentation (SBA) with a distilled dynamic soft label regularizer and report improved generalization and faster convergence on CIFAR-10, CIFAR-100, and ImageNet.

Data augmentation has been intensively used in training deep neural networks to improve generalization, whether in the original space (e.g., image space) or in representation space. Despite its success, the connection between the synthesized data and the original data is largely ignored in training: the distributional information that the synthesized samples surround the original sample is not taken into account. Hence, the behavior of the network is not optimized for this. However, that behavior is crucially important for generalization, even in the adversarial setting, for the safety of the deep learning system. In this work, we propose a framework called Stochastic Batch Augmentation (SBA) to address these problems. SBA stochastically decides, at iterations controlled by the batch scheduler, whether to augment, and it introduces a "distilled" dynamic soft label regularization that incorporates the similarity in the vicinity distribution with respect to the raw samples. The proposed regularization provides direct supervision through the KL divergence between the output softmax distributions of the original and virtual data. Our experiments on CIFAR-10, CIFAR-100, and ImageNet show that SBA can improve the generalization of neural networks and speed up the convergence of network training.
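The regularizer described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names, the fixed augmentation probability `augment_prob` (a stand-in for the batch scheduler), the weight `lam`, and the direction of the KL term (original relative to virtual) are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def sba_loss(logits_orig, logits_virtual, label, augment_prob, lam, rng=random):
    """Hypothetical per-sample loss.

    Always computes cross-entropy on the original sample. With probability
    `augment_prob` (assumed stand-in for the batch scheduler), adds `lam`
    times the mean KL divergence between the softmax output of the original
    sample and that of each virtual (augmented) sample.
    Returns (loss, augmented_flag).
    """
    p_orig = softmax(logits_orig)
    ce = -math.log(p_orig[label] + 1e-12)

    # Stochastic decision: skip the augmentation term on this iteration.
    if rng.random() >= augment_prob:
        return ce, False

    # Assumed direction: KL(original || virtual), averaged over virtual samples.
    reg = sum(kl_divergence(p_orig, softmax(lv)) for lv in logits_virtual)
    reg /= len(logits_virtual)
    return ce + lam * reg, True


if __name__ == "__main__":
    orig = [2.0, 0.5, -1.0]
    virtual = [[1.8, 0.7, -0.9], [2.2, 0.1, -1.1]]  # made-up logits
    loss, augmented = sba_loss(orig, virtual, label=0, augment_prob=0.5, lam=1.0)
    print(loss, augmented)
```

In this sketch the KL term is zero when the network produces identical outputs for the original and virtual samples, so it only penalizes disagreement within the vicinity of each raw sample.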

