LG · Jul 25, 2022

The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks

arXiv:2207.12547v2 · 2 citations · h-index: 19
Originality: Synthesis-oriented
AI Analysis

This paper provides a foundational dataset for researchers studying neural network training dynamics. Its contribution is incremental, since it focuses on empirical data collection rather than on new methods.

To study training dynamics in fully connected neural networks, the authors built a large-scale empirical dataset covering 483 thousand hyperparameter choices, which yielded 11 million training runs and 40 billion recorded epochs. They report durable patterns that persist across tasks and topologies, and they aim to spark scientific study that leads to theoretical advances.

We present an empirical dataset surveying the deep learning phenomenon on fully-connected feed-forward multilayer perceptron neural networks. The dataset, which is now freely available online, records the per-epoch training and generalization performance of 483 thousand distinct hyperparameter choices of architectures, tasks, depths, network sizes (number of parameters), learning rates, batch sizes, and regularization penalties. Repeating each experiment an average of 24 times resulted in 11 million total training runs and 40 billion epochs recorded. Accumulating this 1.7 TB dataset utilized 11 thousand CPU core-years, 72.3 GPU-years, and 163 node-years. In surveying the dataset, we observe durable patterns persisting across tasks and topologies. We aim to spark scientific study of machine learning techniques as a catalyst for the theoretical discoveries needed to progress the field beyond energy-intensive and heuristic practices.
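
The scale figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration built from the rounded numbers in the text; the constant and function names are hypothetical, and the derived values are rough averages, not statistics reported by the authors.

```python
# Rounded figures as quoted in the abstract.
N_CONFIGS = 483_000             # distinct hyperparameter choices
AVG_REPETITIONS = 24            # average repetitions per experiment
TOTAL_RUNS = 11_000_000         # stated total training runs
TOTAL_EPOCHS = 40_000_000_000   # stated total epochs recorded


def derived_scale():
    """Return (implied run count, mean epochs per run) from the rounded figures.

    Both values are back-of-the-envelope estimates: the inputs are rounded,
    so the results indicate magnitude only.
    """
    implied_runs = N_CONFIGS * AVG_REPETITIONS
    epochs_per_run = TOTAL_EPOCHS / TOTAL_RUNS
    return implied_runs, epochs_per_run


if __name__ == "__main__":
    implied_runs, epochs_per_run = derived_scale()
    print(f"implied runs (configs x repetitions): {implied_runs:,}")
    print(f"mean epochs per run: {epochs_per_run:,.0f}")
```

Under these assumptions, 483 thousand choices times 24 repetitions gives roughly 11.6 million runs, in line with the stated 11 million once rounding is taken into account, and the stated totals imply a few thousand epochs per run on average.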

Code Implementations: 1 repo
