LG · Nov 1, 2021

Large-Scale Deep Learning Optimizations: A Comprehensive Survey

arXiv:2111.00856v2 · 19 citations
Originality: Synthesis-oriented
AI Analysis

The paper provides a comprehensive overview for researchers and practitioners dealing with scalability challenges in deep learning, but as a survey its contribution is incremental.

This survey examines optimization techniques for large-scale deep learning, focusing on improving model accuracy and efficiency by addressing training time, communication overhead, and memory usage.

Deep learning has achieved promising results on a wide spectrum of AI applications. Larger datasets and models consistently yield better performance. However, training them generally takes longer because of the additional computation and communication. In this survey, we aim to provide a clear sketch of the optimizations for large-scale deep learning with regard to model accuracy and model efficiency. We investigate the algorithms most commonly used for optimization, elaborate on the debated topic of the generalization gap that arises in large-batch training, and review state-of-the-art strategies for addressing communication overhead and reducing the memory footprint.

