LG NE ML · Nov 29, 2020

Scaling Down Deep Learning with MNIST-1D

arXiv:2011.14439v5 · 14.328 citations · Has Code

Originality: Incremental advance

AI Analysis

This work provides a low-compute, low-memory benchmark for researchers and educators to study fundamental deep learning phenomena without extensive computational resources, making deep learning research more accessible.

The paper introduces MNIST-1D, a minimalist 1D dataset with dimensionality 40 and a training set size of 4000, designed to enable rapid experimentation and study of deep learning phenomena. It demonstrates that MNIST-1D can be used to replicate various deep learning observations, such as inductive biases, lottery tickets, and double descent, with experiments completing in minutes on a CPU or GPU.

Although deep learning models have taken on commercial and political relevance, key aspects of their training and operation remain poorly understood. This has sparked interest in science-of-deep-learning projects, many of which require large amounts of time, money, and electricity. But how much of this research really needs to occur at scale? In this paper, we introduce MNIST-1D: a minimalist, procedurally generated, low-memory, and low-compute alternative to classic deep learning benchmarks. Although the dimensionality of MNIST-1D is only 40 and its default training set size only 4000, MNIST-1D can be used to study inductive biases of different deep architectures, find lottery tickets, observe deep double descent, metalearn an activation function, and demonstrate guillotine regularization in self-supervised learning. All these experiments can be conducted on a GPU or often even on a CPU within minutes, allowing for fast prototyping, educational use cases, and cutting-edge research on a low budget.
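To put the "low-memory" claim in concrete terms, the following is a rough back-of-the-envelope sketch, not taken from the paper. It uses the two figures from the abstract (4000 training examples, 40 dimensions) and compares them with the classic MNIST training set (60,000 images of 28×28 pixels). The float32 storage format and the helper name `dataset_bytes` are assumptions made for illustration only.

```python
# Rough size comparison of input features only (labels ignored).
# Assumption: values are stored as dense float32 (4 bytes each).
BYTES_PER_FLOAT32 = 4


def dataset_bytes(n_examples, n_features, bytes_per_value=BYTES_PER_FLOAT32):
    """Bytes needed to hold the input features as a dense array."""
    return n_examples * n_features * bytes_per_value


# Figures from the abstract: 4000 training examples, dimensionality 40.
mnist1d_bytes = dataset_bytes(4000, 40)

# Classic MNIST training set: 60,000 images of 28 x 28 pixels.
mnist_bytes = dataset_bytes(60000, 28 * 28)

print(f"MNIST-1D: {mnist1d_bytes / 1e6:.2f} MB")
print(f"MNIST:    {mnist_bytes / 1e6:.2f} MB")
print(f"Ratio:    {mnist_bytes // mnist1d_bytes}x")
```

Under these assumptions the MNIST-1D training inputs occupy well under a megabyte, which is consistent with the abstract's description of experiments that run within minutes on a CPU.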
