ML, LG
Jun 24, 2018

Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

arXiv:1806.09077v4
57 citations
Originality: Incremental advance
AI Analysis

This work addresses limitations of backpropagation for deep learning practitioners, offering an incremental improvement with potential applications to large-scale, online, and continual learning scenarios.

The paper proposes an online alternating minimization (AM) approach with auxiliary variables for training deep neural networks. It reports promising empirical results on a variety of architectures and datasets and provides the first theoretical convergence guarantees for AM in stochastic settings.

Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function. State-of-the-art methods rely on error backpropagation, which suffers from several well-known issues, such as vanishing and exploding gradients, inability to handle non-differentiable nonlinearities and to parallelize weight-updates across layers, and biological implausibility. These limitations continue to motivate exploration of alternative training algorithms, including several recently proposed auxiliary-variable methods which break the complex nested objective function into local subproblems. However, those techniques are mainly offline (batch), which limits their applicability to extremely large datasets, as well as to online, continual or reinforcement learning. The main contribution of our work is a novel online (stochastic/mini-batch) alternating minimization (AM) approach for training deep neural networks, together with the first theoretical convergence guarantees for AM in stochastic settings and promising empirical results on a variety of architectures and datasets.
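
The auxiliary-variable idea mentioned in the abstract can be illustrated with a generic formulation. This is a sketch under assumed notation, not necessarily the paper's exact objective: the layer maps f_l, the per-sample codes c_{l,t}, the penalty weight \lambda, and the mini-batch B are illustrative symbols that do not appear in the abstract.

```latex
% Requires amsmath. All symbols are assumptions introduced for illustration.
\begin{align*}
% Nested objective that backpropagation differentiates end to end
\text{nested:}\quad
  & \min_{W_1,\dots,W_L} \sum_{t}
    \mathcal{L}\bigl(y_t,\ f_L(W_L, f_{L-1}(\cdots f_1(W_1, x_t)\cdots))\bigr) \\
% Relaxation: one auxiliary code per layer and sample, with c_{0,t} = x_t
\text{relaxed:}\quad
  & \min_{W,\,c} \sum_{t} \Bigl[ \mathcal{L}(y_t, c_{L,t})
    + \lambda \sum_{l=1}^{L} \bigl\lVert c_{l,t} - f_l(W_l, c_{l-1,t}) \bigr\rVert_2^2 \Bigr] \\
% Alternating steps on one mini-batch B (online setting)
\text{code step:}\quad
  & c_{\cdot,t} \leftarrow \arg\min_{c}\ \mathcal{L}(y_t, c_{L})
    + \lambda \sum_{l=1}^{L} \bigl\lVert c_{l} - f_l(W_l, c_{l-1}) \bigr\rVert_2^2,
    \qquad t \in B,\ W \text{ fixed} \\
\text{weight step:}\quad
  & W_l \leftarrow \arg\min_{W_l} \sum_{t \in B}
    \bigl\lVert c_{l,t} - f_l(W_l, c_{l-1,t}) \bigr\rVert_2^2,
    \qquad l = 1,\dots,L,\ c \text{ fixed}
\end{align*}
```

In this relaxed form, once the codes are fixed each W_l appears in only one penalty term, so the weight subproblems decouple across layers and can in principle be solved in parallel. Processing one mini-batch at a time, rather than the whole dataset, is what distinguishes an online scheme from the offline (batch) auxiliary-variable methods the abstract refers to.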

Code Implementations: 1 repo
