LG · Aug 9, 2021

Training of deep residual networks with stochastic MG/OPT

Cyrill von Planta, Alena Kopanicakova, Rolf Krause

arXiv:2108.04052v14.44 citations

Originality: Incremental advance

AI Analysis

This work addresses training efficiency and robustness for deep residual networks, offering incremental improvements in optimization methods.

The paper trains deep residual networks with a stochastic variant of the nonlinear multigrid method MG/OPT, building the multilevel hierarchy from the dynamical systems viewpoint of residual networks. It reports significant speed-ups and improved robustness on MNIST, and its experiments suggest that multilevel training can also serve as a pruning technique.

We train deep residual networks with a stochastic variant of the nonlinear multigrid method MG/OPT. To build the multilevel hierarchy, we use the dynamical systems viewpoint specific to residual networks. We report significant speed-ups and additional robustness for training MNIST on deep residual networks. Our numerical experiments also indicate that multilevel training can be used as a pruning technique, as many of the auxiliary networks have accuracies comparable to the original network.
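The dynamical systems viewpoint mentioned in the abstract can be illustrated with a small sketch. A residual network can be read as a forward Euler discretization of an ODE, so a shallower auxiliary network is obtained by taking fewer, larger time steps. The code below is only an illustration under stated assumptions, not the authors' implementation: the `tanh` layer, the names `resnet_forward` and `coarsen`, and the rule "keep every second layer and double the step size" are all hypothetical choices made for this example.

```python
import numpy as np

def resnet_forward(x, weights, h):
    """Forward Euler view of a residual network: x <- x + h * tanh(W @ x)."""
    for W in weights:
        x = x + h * np.tanh(W @ x)
    return x

def coarsen(weights, h):
    """Hypothetical coarsening rule (an assumption for this sketch):
    keep every second layer and double the step size, so the total
    integration time h * depth is unchanged."""
    return weights[::2], 2.0 * h

rng = np.random.default_rng(0)
width, depth, total_time = 4, 8, 1.0

# Fine-level network: 8 residual layers with step size 1/8.
fine_weights = [0.1 * rng.standard_normal((width, width)) for _ in range(depth)]
fine_h = total_time / depth

# Coarse-level (auxiliary) network: 4 residual layers with step size 1/4.
coarse_weights, coarse_h = coarsen(fine_weights, fine_h)

x0 = rng.standard_normal(width)
y_fine = resnet_forward(x0, fine_weights, fine_h)
y_coarse = resnet_forward(x0, coarse_weights, coarse_h)
```

Both networks integrate over the same time horizon, which is why a coarse network in such a hierarchy can plausibly approximate the fine one. That is consistent with the abstract's remark that many auxiliary networks reach accuracies comparable to the original network, though the sketch itself does not demonstrate it.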
