Training Neural Networks at Any Scale
The article provides an introductory overview of scalable neural network training for practitioners and researchers, but its contribution is incremental, since it reviews existing methods.
The paper reviews modern optimization methods for training neural networks, focusing on efficiency and scalability, and presents state-of-the-art algorithms under a unified template that highlights the importance of adapting to the structure of the problem.
This article reviews modern optimization methods for training neural networks with an emphasis on efficiency and scale. We present state-of-the-art optimization algorithms under a unified algorithmic template that highlights the importance of adapting to the structures in the problem. We then cover how to make these algorithms agnostic to the scale of the problem. Our exposition is intended as an introduction for both practitioners and researchers who wish to be involved in these exciting new developments.