LG, NA, OC | Jul 15, 2021

Globally Convergent Multilevel Training of Deep Residual Networks

arXiv:2107.07572v2 | 22 citations
AI Analysis

This work targets the efficiency and convergence of training deep residual networks. It is an incremental improvement over existing multilevel training methods.

The authors propose a globally convergent multilevel method for training deep residual networks. The method adaptively adjusts mini-batch sizes and incorporates curvature information on all levels of the multilevel hierarchy. Its performance and convergence properties are investigated numerically on classification and regression examples, but the abstract gives no specific numerical results.

We propose a globally convergent multilevel training method for deep residual networks (ResNets). The devised method can be seen as a novel variant of the recursive multilevel trust-region (RMTR) method, which operates in hybrid (stochastic-deterministic) settings by adaptively adjusting mini-batch sizes during the training. The multilevel hierarchy and the transfer operators are constructed by exploiting a dynamical system's viewpoint, which interprets forward propagation through the ResNet as a forward Euler discretization of an initial value problem. In contrast to traditional training approaches, our novel RMTR method also incorporates curvature information on all levels of the multilevel hierarchy by means of the limited-memory SR1 method. The overall performance and the convergence properties of our multilevel training method are numerically investigated using examples from the field of classification and regression.
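
The dynamical-systems viewpoint mentioned in the abstract can be illustrated with a minimal sketch. It assumes a residual block of the form x_{k+1} = x_k + h * tanh(W_k x_k + b_k), which is a forward Euler step of size h = T / K for the initial value problem dx/dt = tanh(W(t) x + b(t)). The function names `resnet_forward` and `prolongate_in_time` are hypothetical, and the piecewise-constant prolongation is a simplifying assumption chosen for illustration, not necessarily the transfer operator used by the authors.

```python
import numpy as np


def resnet_forward(x0, weights, biases, T=1.0):
    """Propagate x0 through K residual blocks, read as forward Euler steps.

    Each block computes x_{k+1} = x_k + h * tanh(W_k x_k + b_k) with h = T / K,
    so a deeper network is a finer time discretization of the same problem.
    """
    K = len(weights)
    h = T / K  # Euler step size; halves when the number of layers doubles
    x = x0
    for W, b in zip(weights, biases):
        x = x + h * np.tanh(W @ x + b)
    return x


def prolongate_in_time(coarse_params):
    """Map coarse-level parameters to a level with twice as many layers.

    Assumption: piecewise-constant interpolation in time, i.e. every coarse
    layer is copied into two consecutive fine layers.
    """
    return [p.copy() for p in coarse_params for _ in range(2)]


rng = np.random.default_rng(0)
d, K_coarse = 4, 3  # feature width and number of coarse-level layers

W_c = [rng.standard_normal((d, d)) for _ in range(K_coarse)]
b_c = [rng.standard_normal(d) for _ in range(K_coarse)]

# Fine level: same time interval [0, T], twice as many Euler steps.
W_f, b_f = prolongate_in_time(W_c), prolongate_in_time(b_c)

x0 = rng.standard_normal(d)
y_c = resnet_forward(x0, W_c, b_c)  # coarse network output
y_f = resnet_forward(x0, W_f, b_f)  # fine network output
```

Because both networks discretize the same time interval, the coarse network acts as a cheaper approximation of the fine one, which is the property a multilevel hierarchy relies on. The two outputs generally differ by the discretization error of the Euler scheme.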

