The Effects of Hyperparameters on SGD Training of Neural Networks
This work addresses the limited exploration of hyperparameter spaces in prior research, but it appears incremental because it focuses on experimental analysis without introducing new methods.
The paper investigates how hyperparameters like learning rate, batch size, and depth affect neural network training, reporting results from large-scale experiments to explore their interactions.
The performance of neural network classifiers is determined by a number of hyperparameters, including learning rate, batch size, and depth. A number of attempts have been made in the literature to explore these hyperparameters and, at times, to develop methods for optimizing them. However, exploration of the hyperparameter space has often been limited. In this note, I report the results of large-scale experiments exploring these hyperparameters and their interactions.
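As a rough illustration of what exploring these hyperparameters jointly can involve, the minimal sketch below enumerates a full grid over learning rate, batch size, and depth. The specific values and the helper name `hyperparameter_grid` are assumptions made for illustration; they are not taken from the experiments reported in this note, and the training step itself is omitted.

```python
from itertools import product

# Hypothetical grid values, chosen for illustration only.
LEARNING_RATES = [0.001, 0.01, 0.1]
BATCH_SIZES = [32, 128, 512]
DEPTHS = [2, 4, 8]


def hyperparameter_grid(learning_rates, batch_sizes, depths):
    """Return every (learning rate, batch size, depth) combination as a dict."""
    return [
        {"learning_rate": lr, "batch_size": bs, "depth": d}
        for lr, bs, d in product(learning_rates, batch_sizes, depths)
    ]


# One configuration per training run; a full grid makes it possible to
# study interactions between hyperparameters, not only each one in isolation.
configs = hyperparameter_grid(LEARNING_RATES, BATCH_SIZES, DEPTHS)
```

The grid grows multiplicatively with each added hyperparameter (here 3 x 3 x 3 configurations), which is one reason such explorations are often kept small in practice.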