LG Nov 1, 2021

Large-Scale Deep Learning Optimizations: A Comprehensive Survey

Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You

arXiv:2111.00856v28.419 citations

Originality Synthesis-oriented

AI Analysis

It provides a comprehensive overview for researchers and practitioners dealing with scalability challenges in deep learning, but it is incremental as a survey paper.

This survey examines optimization techniques for large-scale deep learning, focusing on improving model accuracy and efficiency by addressing training time, communication overhead, and memory usage.

Deep learning has achieved promising results on a wide spectrum of AI applications. Larger datasets and models consistently yield better performance. However, training them generally takes longer because it requires more computation and communication. In this survey, we aim to provide a clear sketch of the optimizations for large-scale deep learning with regard to model accuracy and model efficiency. We investigate the algorithms most commonly used for optimization, elaborate on the debatable topic of the generalization gap that arises in large-batch training, and review the SOTA strategies for addressing the communication overhead and reducing the memory footprint.
