On Regularization Properties of Artificial Datasets for Deep Learning
The paper addresses data scarcity for deep learning practitioners, but the contribution appears incremental, since it mainly draws parallels to existing regularization methods.
The paper tackles the problem of training neural networks when real data is scarce by using artificial datasets, demonstrating that generating data by injecting noise into high-level features acts as a form of deep regularization for hidden layers.
The paper discusses regularization properties of artificial data for deep learning. Artificial datasets make it possible to train neural networks when real data is scarce. It is demonstrated that the artificial data generation process, described as injecting noise into high-level features, bears several similarities to existing regularization methods for deep neural networks. One can treat this property of artificial data as a kind of "deep" regularization. It is thus possible to regularize hidden layers of the network by generating the training data in a certain way.
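To make "injecting noise into high-level features" concrete, here is a minimal sketch under a toy setting that is not taken from the paper. It assumes the high-level features are the frequency and phase of a sine wave, and that a fixed generator turns them into an observed signal. The names `render` and `make_artificial_batch`, the choice of Gaussian noise, and the value of `sigma` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def render(features, n_points=32):
    """Hypothetical generator: maps high-level features (frequency, phase)
    to an observed signal. It stands in for whatever process renders
    artificial data from its underlying description."""
    freq, phase = features[..., 0:1], features[..., 1:2]
    t = np.linspace(0.0, 1.0, n_points)
    return np.sin(2.0 * np.pi * freq * t + phase)


def make_artificial_batch(base_features, n_samples, sigma, rng):
    """Perturb the high-level features with Gaussian noise (an assumption),
    then render each perturbed feature vector into an input-space sample."""
    base = np.asarray(base_features, dtype=float)
    noise = rng.normal(0.0, sigma, size=(n_samples, base.shape[0]))
    return render(base + noise)


rng = np.random.default_rng(0)
# Eight artificial samples around one base description (freq=3.0, phase=0.5).
batch = make_artificial_batch([3.0, 0.5], n_samples=8, sigma=0.1, rng=rng)
print(batch.shape)
```

The noise here is applied before rendering, so every sample is still a valid sine wave and the variation in input space follows the structure of the generator. This differs from adding noise directly to the rendered signal, and it is the sense in which the generation process can be compared to noise-based regularizers that act on hidden representations.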