LG · ML · Mar 5, 2018

Conducting Credit Assignment by Aligning Local Representations

arXiv:1803.01834v2 · 29 citations
Originality: Incremental advance
AI Analysis

This work addresses the training difficulties faced by users experimenting with new deep network architectures. The advance is incremental, since it builds on existing alternatives to back-propagation.

The paper tackles a known weakness of back-propagation: training deep networks with it is sensitive to weight initialization and hard for new users to get right. It introduces Local Representation Alignment (LRA), a training procedure that is robust to bad initializations and works with a variety of nonlinearities. On MNIST and Fashion MNIST, LRA trains networks successfully even in settings where back-propagation fails.

Using back-propagation and its variants to train deep networks is often problematic for new users. Issues such as exploding gradients, vanishing gradients, and high sensitivity to weight initialization strategies often make networks difficult to train, especially when users are experimenting with new architectures. Here, we present Local Representation Alignment (LRA), a training procedure that is much less sensitive to bad initializations, does not require modifications to the network architecture, and can be adapted to networks with highly nonlinear and discrete-valued activation functions. Furthermore, we show that one variation of LRA can start with a null initialization of network weights and still successfully train networks with a wide variety of nonlinearities, including tanh, ReLU-6, softplus, signum and others that may draw their inspiration from biology. A comprehensive set of experiments on MNIST and the much harder Fashion MNIST data sets show that LRA can be used to train networks robustly and effectively, succeeding even when back-propagation fails and outperforming other alternative learning algorithms, such as target propagation and feedback alignment.
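The abstract does not spell out the LRA update rule, so the sketch below does not attempt to implement it. It only illustrates two points the abstract relies on: what some of the named nonlinearities look like, and why a null (all-zero) initialization is a failure case for plain back-propagation. The two-layer, bias-free network, the squared-error loss, and all function names here are assumptions chosen for illustration, not details taken from the paper.

```python
import math

# Nonlinearities named in the abstract (standard definitions).
def relu6(x):
    return min(max(0.0, x), 6.0)

def softplus(x):
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def signum(x):
    # Discrete-valued: its derivative is zero wherever it is defined,
    # so gradient-based training gets no signal through it.
    return (x > 0) - (x < 0)

def backprop_grads(x, y, W1, W2):
    """Back-propagation gradients for a hypothetical bias-free network
    out = W2 @ tanh(W1 @ x) with loss 0.5 * ||out - y||^2.
    W1 and W2 are lists of rows."""
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    h = [math.tanh(v) for v in z]
    out = [sum(w * hj for w, hj in zip(row, h)) for row in W2]
    err = [o - t for o, t in zip(out, y)]
    # dL/dW2[k][j] = err[k] * h[j]
    gW2 = [[e * hj for hj in h] for e in err]
    # Error sent back to the hidden layer, times tanh'(z) = 1 - h^2.
    back = [
        sum(W2[k][j] * err[k] for k in range(len(err))) * (1.0 - h[j] ** 2)
        for j in range(len(h))
    ]
    # dL/dW1[j][i] = back[j] * x[i]
    gW1 = [[b * xi for xi in x] for b in back]
    return gW1, gW2

def all_zero(matrix):
    return all(g == 0.0 for row in matrix for g in row)

x, y = [1.0, 2.0], [1.0]

# Null initialization: hidden activations are zero, so gW2 vanishes,
# and W2 is zero, so nothing propagates back to W1 either.
zero_W1 = [[0.0, 0.0], [0.0, 0.0]]
zero_W2 = [[0.0, 0.0]]
gW1_zero, gW2_zero = backprop_grads(x, y, zero_W1, zero_W2)

# A small non-zero initialization gives usable gradients.
W1 = [[0.1, -0.2], [0.3, 0.4]]
W2 = [[0.5, -0.5]]
gW1, gW2 = backprop_grads(x, y, W1, W2)

print("zero init, all gradients zero:", all_zero(gW1_zero) and all_zero(gW2_zero))
print("non-zero init, all gradients zero:", all_zero(gW1) and all_zero(gW2))
```

In this toy setting every weight gradient is exactly zero under the null initialization, so gradient descent never moves the weights. This is the kind of starting point from which the abstract reports that one LRA variant can still train.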

