NE LG Dec 22, 2017

Learning in the Machine: the Symmetries of the Deep Learning Channel

arXiv:1712.08608v113.031 citations

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of local learning in physical neural systems. The advance is incremental because it builds on existing random backpropagation methods.

The paper tackles the problem of implementing biologically plausible learning rules in neural systems by analyzing the symmetries of the deep learning channel. Simulations show that variations of random backpropagation can achieve the desired symmetries and are robust to symmetry variations, and mathematical results in simple cases indicate that the learning equations converge to fixed points.

In a physical neural system, learning rules must be local both in space and time. In order for learning to occur, non-local information must be communicated to the deep synapses through a communication channel, the deep learning channel. We identify several possible architectures for this learning channel (Bidirectional, Conjoined, Twin, Distinct) and six symmetry challenges: 1) symmetry of architectures; 2) symmetry of weights; 3) symmetry of neurons; 4) symmetry of derivatives; 5) symmetry of processing; and 6) symmetry of learning rules. Random backpropagation (RBP) addresses the second and third symmetries, and some of its variations, such as skipped RBP (SRBP), address the first and fourth symmetries. Here we address the last two desirable symmetries, showing through simulations that they can be achieved and that the learning channel is particularly robust to symmetry variations. Specifically, random backpropagation and its variations can be performed with the same non-linear neurons used in the main input-output forward channel, and the connections in the learning channel can be adapted using the same algorithm used in the forward channel, removing the need for any specialized hardware in the learning channel. Finally, we provide mathematical results in simple cases showing that the learning equations in the forward and backward channels converge to fixed points, for almost any initial conditions. In symmetric architectures, if the weights in both channels are small at initialization, adaptation in both channels leads to weights that are essentially symmetric during and after learning. Biological connections are discussed.
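To make the idea concrete, here is a minimal NumPy sketch of random backpropagation on a tiny two-layer network. It is an illustration under stated assumptions, not the authors' implementation: the network sizes, the linear teacher task, the learning rate, and the function name `train_rbp` are all hypothetical. The hidden-layer error is routed through a separate matrix `B` rather than the transpose of the forward weights `W2`. When `adapt_channel=True`, `B` is updated with the same outer product of hidden activity and output error that updates `W2`; this is one plausible reading of "the same algorithm used in the forward channel" and should be treated as an assumption. The sketch keeps the learning channel linear and does not model non-linear neurons in that channel.

```python
import numpy as np

def train_rbp(adapt_channel=False, steps=2000, lr=0.05, seed=0):
    """Toy random backpropagation (RBP) on a two-layer network.

    All sizes, data, and hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_in, n_hid, n_out, n_samples = 4, 8, 2, 64
    X = rng.normal(size=(n_samples, n_in))
    T = X @ rng.normal(size=(n_in, n_out))      # targets from a linear teacher

    # Small initial weights in both the forward and the learning channel.
    W1 = 0.1 * rng.normal(size=(n_in, n_hid))
    W2 = 0.1 * rng.normal(size=(n_hid, n_out))
    B = 0.1 * rng.normal(size=(n_out, n_hid))   # learning-channel weights
    D0 = B - W2.T                               # initial asymmetry

    losses = []
    for _ in range(steps):
        H = np.tanh(X @ W1)                     # hidden activity
        Y = H @ W2                              # linear output
        E = T - Y                               # output error
        losses.append(float(np.mean(E ** 2)))

        # Error reaches the hidden layer through B, not through W2.T.
        dH = (E @ B) * (1.0 - H ** 2)
        gW2 = H.T @ E / n_samples               # local outer-product update

        W1 += lr * (X.T @ dH) / n_samples
        W2 += lr * gW2
        if adapt_channel:
            # Assumed adaptive variant: same update, transposed, applied to B.
            B += lr * gW2.T
    return losses, W2, B, D0

if __name__ == "__main__":
    losses, W2, B, D0 = train_rbp(adapt_channel=True)
    asym = np.linalg.norm(B - W2.T) / np.linalg.norm(W2)
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("relative asymmetry after training:", asym)
```

In this sketch the difference `B - W2.T` never changes when both channels adapt, so it stays at its initial value. If both matrices start small and `W2` grows during training, the relative asymmetry shrinks, which mirrors the abstract's statement about small initial weights leading to essentially symmetric weights.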
