A synthetic dataset for deep learning
The synthetic dataset gives researchers a new experimental tool for verifying deep learning theories, though the contribution is incremental because it builds on existing datasets such as MNIST.
The authors address the lack of datasets with known distributions for verifying deep learning theory by generating a synthetic dataset with an explicit Gaussian distribution. The resulting dataset mimics the characteristics of MNIST, so DNNs can be applied to it easily.
In this paper, we propose a novel method for generating a synthetic dataset that obeys a Gaussian distribution. Unlike the commonly used benchmark datasets, whose distributions are unknown, the synthetic dataset has an explicit distribution, namely a Gaussian distribution. At the same time, it has the same characteristics as the benchmark dataset MNIST, so Deep Neural Networks (DNNs) can easily be applied to it. The synthetic dataset thus provides a novel experimental tool for verifying proposed theories of deep learning.
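The abstract does not spell out the generation procedure, so the following is only a minimal sketch of what a Gaussian dataset in an MNIST-like format could look like, not the method proposed in the paper. It assumes 10 classes of 28x28 single-channel samples, one randomly chosen mean per class, and isotropic Gaussian noise around that mean. The function name `make_gaussian_dataset` and all of its parameters are hypothetical.

```python
import numpy as np


def make_gaussian_dataset(n_per_class=100, n_classes=10, side=28,
                          noise_std=0.1, seed=0):
    """Draw an MNIST-shaped dataset from per-class isotropic Gaussians.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    dim = side * side

    # Assumption: each class mean is a random point in [0, 1]^dim.
    means = rng.uniform(0.0, 1.0, size=(n_classes, dim))

    # Labels 0..n_classes-1, each repeated n_per_class times.
    labels = np.repeat(np.arange(n_classes), n_per_class)

    # Each sample is its class mean plus isotropic Gaussian noise.
    # Values are deliberately not clipped to [0, 1]: clipping would
    # make the distribution no longer exactly Gaussian.
    noise = rng.normal(0.0, noise_std, size=(labels.size, dim))
    samples = means[labels] + noise

    # Reshape flat vectors into MNIST-like side x side images.
    images = samples.reshape(-1, side, side)
    return images, labels, means


images, labels, means = make_gaussian_dataset()
print(images.shape, labels.shape)
```

Because the class means and the noise scale are known exactly, quantities that are unavailable for MNIST, such as the true class-conditional density of a sample, can be computed in closed form under these assumptions and compared with what a trained DNN learns.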