Ross D. Pantone

28.5 | NE | Jun 2, 2020

Training End-to-End Analog Neural Networks with Equilibrium Propagation

Jack Kendall, Ross Pantone, Kalpana Manickavasagam et al.

We introduce a principled method to train end-to-end analog neural networks by stochastic gradient descent. In these analog neural networks, the weights to be adjusted are implemented by the conductances of programmable resistive devices such as memristors [Chua, 1971], and the nonlinear transfer functions (or 'activation functions') are implemented by nonlinear components such as diodes. We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models: they possess an energy function as a consequence of Kirchhoff's laws governing electrical circuits. This property enables us to train them using the Equilibrium Propagation framework [Scellier and Bengio, 2017]. Our update rule for each conductance, which is local and relies solely on the voltage drop across the corresponding resistor, is shown to compute the gradient of the loss function. Our numerical simulations, which use the SPICE-based Spectre simulation framework to simulate the dynamics of electrical circuits, demonstrate training on the MNIST classification task, performing comparably to or better than equivalent-size software-based neural networks. Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
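The local update rule described in this abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a purely linear resistive network (no diodes), a hypothetical 5-node topology, a squared-error loss on a single output node, and an energy of the form E = sum over edges of (1/2) g (dV)^2. Under those assumptions, the Equilibrium Propagation estimate for each conductance is ((dV_nudged)^2 - (dV_free)^2) / (2 * beta), which uses only the voltage drop across that resistor. All names (`EDGES`, `solve`, `eqprop_grad`, and so on) are invented for the illustration.

```python
import numpy as np

# Hypothetical 5-node linear resistive network (illustration only).
# Nodes 0 and 1 are clamped inputs; nodes 2, 3, 4 float; node 4 is the output.
EDGES = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4), (2, 3)]
CLAMPED = [0, 1]
FREE = [2, 3, 4]
OUT = 4
N_NODES = 5

def laplacian(g):
    """Weighted graph Laplacian built from the edge conductances g."""
    L = np.zeros((N_NODES, N_NODES))
    for (i, j), gij in zip(EDGES, g):
        L[i, i] += gij
        L[j, j] += gij
        L[i, j] -= gij
        L[j, i] -= gij
    return L

def solve(g, v_in, beta=0.0, target=0.0):
    """Node voltages minimizing E + beta * C, where
    E = sum 1/2 g_ij (V_i - V_j)^2 and C = 1/2 (V_out - target)^2.
    With beta = 0 this is just Kirchhoff's current law at the free nodes."""
    L = laplacian(g)
    A = L[np.ix_(FREE, FREE)].copy()
    b = -L[np.ix_(FREE, CLAMPED)] @ v_in
    k = FREE.index(OUT)
    A[k, k] += beta            # nudging acts like a conductance beta ...
    b[k] += beta * target      # ... tying the output node to the target
    v = np.zeros(N_NODES)
    v[CLAMPED] = v_in
    v[FREE] = np.linalg.solve(A, b)
    return v

def drops(v):
    """Voltage drop across every resistor."""
    return np.array([v[i] - v[j] for i, j in EDGES])

def loss(g, v_in, target):
    return 0.5 * (solve(g, v_in)[OUT] - target) ** 2

def eqprop_grad(g, v_in, target, beta=1e-4):
    """Two-phase estimate of d(loss)/d(g): free phase, then nudged phase."""
    free = drops(solve(g, v_in))
    nudged = drops(solve(g, v_in, beta, target))
    return (nudged ** 2 - free ** 2) / (2.0 * beta)

def finite_diff_grad(g, v_in, target, eps=1e-6):
    """Reference gradient by central differences, for comparison only."""
    grad = np.zeros_like(g)
    for k in range(len(g)):
        d = np.zeros_like(g)
        d[k] = eps
        grad[k] = (loss(g + d, v_in, target) - loss(g - d, v_in, target)) / (2 * eps)
    return grad

# Arbitrary example values (assumptions, not taken from the paper).
g = np.array([1.0, 0.5, 0.8, 1.2, 0.7, 0.9, 0.3])
v_in = np.array([1.0, 0.0])
target = 0.2

if __name__ == "__main__":
    print("EqProp estimate :", eqprop_grad(g, v_in, target))
    print("Finite difference:", finite_diff_grad(g, v_in, target))
```

For small `beta`, the two printed vectors should agree closely, which is the sense in which the local rule computes the loss gradient in this simplified setting. The nonlinear networks and Spectre simulations in the paper are not modeled here.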

4.3 | ET | Mar 3, 2020

Deep Learning in Memristive Nanowire Networks

Jack D. Kendall, Ross D. Pantone, Juan C. Nino

Analog crossbar architectures for accelerating neural network training and inference have made tremendous progress over the past several years. These architectures are ideal for dense layers with fewer than roughly a thousand neurons. However, for large sparse layers, crossbar architectures are highly inefficient. A new hardware architecture, dubbed the MN3 (Memristive Nanowire Neural Network), was recently described as an efficient architecture for simulating very wide, sparse neural network layers, on the order of millions of neurons per layer. The MN3 utilizes a high-density memristive nanowire mesh to efficiently connect large numbers of silicon neurons with modifiable weights. Here, in order to explore the MN3's ability to function as a deep neural network, we describe one algorithm for training deep MN3 models and benchmark simulations of the architecture on two deep learning tasks. We utilize a simple piecewise linear memristor model, since we seek to demonstrate that training is, in principle, possible for randomized nanowire architectures. In future work, we intend to use more realistic memristor models, and we will adapt the presented algorithm appropriately. We show that the MN3 is capable of performing composition, gradient propagation, and weight updates, which together allow it to function as a deep neural network. We show that a simulated multilayer perceptron (MLP), built from MN3 networks, can obtain a 1.61% error rate on the popular MNIST dataset, comparable to an equivalently sized software-based network. This work represents, to the authors' knowledge, the first randomized nanowire architecture capable of reproducing the backpropagation algorithm.
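The contrast between dense and sparse layers on a crossbar can be made concrete with a small sketch. This is an idealized model, not the MN3 and not the memristor model used in the paper: it assumes perfectly linear devices, no wire resistance, and a differential pair of conductances per signed weight. The function names and the `g_max` parameter are hypothetical.

```python
import numpy as np

def weights_to_conductances(W, g_max=1.0):
    """Map a signed weight matrix onto two non-negative conductance arrays.
    Assumes W has at least one nonzero entry."""
    scale = g_max / np.max(np.abs(W))
    G_pos = np.maximum(W, 0.0) * scale
    G_neg = np.maximum(-W, 0.0) * scale
    return G_pos, G_neg, scale

def crossbar_output(G_pos, G_neg, v):
    """Column currents of an idealized differential crossbar.
    Ohm's law gives a current G * v through each device, and Kirchhoff's
    current law sums the currents on each column wire."""
    return (G_pos - G_neg).T @ v

def utilization(W):
    """Fraction of crossbar cross-points that carry a nonzero weight."""
    return np.count_nonzero(W) / W.size

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 3))        # dense layer: 4 inputs, 3 outputs
    v = rng.normal(size=4)             # input voltages
    G_pos, G_neg, scale = weights_to_conductances(W)
    print(crossbar_output(G_pos, G_neg, v) / scale)  # matches W.T @ v
    print(utilization(np.eye(1000)))   # a very sparse layer uses few devices
```

In this model a dense layer uses every cross-point, while a sparse layer still occupies the full rows-by-columns array even though only a small fraction of devices hold nonzero weights, which is one way to read the inefficiency the abstract mentions.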

2 Papers