LG · Jul 13, 2023

Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification

arXiv:2307.06565v1 · 10 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses the efficiency bottleneck that practitioners face in deep learning training. It appears incremental: it builds on existing methods and contributes specific theoretical improvements.

The paper tackles the problem of computationally expensive neural network training by proposing a method for identifying activated neurons in sublinear time using a geometric search data structure, achieving a provable convergence time of O(M^2/ε^2).

Deep learning has been widely used in many fields, but the model training process usually consumes massive computational resources and time. Therefore, designing an efficient neural network training method with a provable convergence guarantee is a fundamental and important research question. In this paper, we present a static half-space report data structure that consists of a fully connected two-layer neural network for shifted ReLU activation to enable activated neuron identification in sublinear time via geometric search. We also prove that our algorithm can converge in $O(M^2/ε^2)$ time with network size quadratic in the coefficient norm upper bound $M$ and error term $ε$.
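The idea of activated neuron identification can be illustrated with a minimal sketch. With a shifted ReLU $\sigma_b(z) = \max(z - b, 0)$, neuron $r$ contributes to the output only when $\langle w_r, x \rangle > b$, so the forward pass needs only the neurons whose weight vectors fall in that half-space. The code below is not the paper's data structure: `report_activated` is a brute-force linear scan standing in for a sublinear half-space reporting query, and the names, the threshold `b`, the fixed output weights `a`, and the $1/\sqrt{m}$ scaling are all assumptions made for illustration.

```python
import numpy as np


def shifted_relu(z, b):
    """Shifted ReLU: max(z - b, 0), applied elementwise."""
    return np.maximum(z - b, 0.0)


def report_activated(W, x, b):
    """Return indices r with <w_r, x> > b.

    Brute-force O(m * d) scan (an assumption for this sketch); a half-space
    reporting structure would answer the same query without touching
    every neuron.
    """
    return np.flatnonzero(W @ x > b)


def forward_dense(W, a, x, b):
    """Two-layer forward pass that evaluates every neuron."""
    m = W.shape[0]
    return float(a @ shifted_relu(W @ x, b)) / np.sqrt(m)


def forward_sparse(W, a, x, b):
    """Same forward pass, restricted to the reported (activated) neurons."""
    m = W.shape[0]
    active = report_activated(W, x, b)
    z = W[active] @ x  # pre-activations of the activated neurons only
    return float(a[active] @ (z - b)) / np.sqrt(m), active


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, d, b = 1024, 32, 1.5            # hypothetical width, input dim, shift
    W = rng.standard_normal((m, d))    # hidden-layer weights
    a = rng.choice([-1.0, 1.0], m)     # fixed output weights (assumption)
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)             # unit-norm input

    y_sparse, active = forward_sparse(W, a, x, b)
    print("activated neurons:", len(active), "of", m)
    print("outputs agree:", np.isclose(y_sparse, forward_dense(W, a, x, b)))
```

A larger shift `b` leaves fewer neurons activated per input, which is what makes restricting the computation to the reported set worthwhile.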

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
