On the role of synaptic stochasticity in training low-precision neural networks
This work addresses the training of low-precision neural networks, a problem relevant to both hardware efficiency and biological modeling. The contribution appears incremental, however, since it builds on existing stochastic and binary-weight methods.
The paper tackles the problem of training neural networks with stochastic binary weights and shows that this approach naturally favors exponentially rare but dense regions of solutions, which offer robustness and good generalization, whereas typical solutions are isolated and hard to locate. It presents analytical and numerical results, including a gradient descent method for binary perceptrons and an extension to deep networks.
Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained from a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension aimed at training discrete deep neural networks is also investigated.
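The idea of running gradient descent on real values that parametrize a distribution over binary synapses can be illustrated with a minimal sketch. It is not the authors' exact procedure. It assumes each weight w_i in {-1, +1} has mean m_i = tanh(theta_i), and it swaps in a simple hinge loss with margin on the mean pre-activation as the objective. The function names, the margin `kappa`, the learning rate, and the problem sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def train_stochastic_binary_perceptron(X, y, epochs=200, lr=0.5, kappa=0.5):
    """Gradient descent on real parameters theta (illustrative sketch).

    Assumption: each binary weight w_i in {-1, +1} is drawn independently
    with mean m_i = tanh(theta_i). The loss is a hinge loss on the
    stability of each pattern computed with the mean weights.
    """
    P, N = X.shape
    theta = np.zeros(N)
    for _ in range(epochs):
        m = np.tanh(theta)                    # mean value of each binary synapse
        stab = y * (X @ m) / np.sqrt(N)       # stability of each pattern under the mean weights
        viol = stab < kappa                   # patterns that violate the margin
        if not viol.any():
            break                             # every pattern satisfies the margin
        # Gradient of sum_mu max(0, kappa - stab_mu) with respect to theta,
        # using the chain rule through tanh: dm/dtheta = 1 - m^2.
        grad = -(y[viol, None] * X[viol]).sum(axis=0) / np.sqrt(N) * (1.0 - m**2)
        theta -= lr * grad
    return theta

def binarize(theta):
    """Most probable binary weight vector: the sign of each mean."""
    return np.where(theta >= 0, 1, -1)

def training_error(w, X, y):
    """Fraction of patterns misclassified by the binary weights w."""
    return float(np.mean(np.sign(X @ w) != y))

# Random +/-1 patterns and labels at low load (P/N is about 0.1).
# N is odd, so the pre-activation X @ w is never exactly zero.
rng = np.random.default_rng(0)
N, P = 201, 20
X = rng.choice([-1, 1], size=(P, N))
y = rng.choice([-1, 1], size=P)

theta = train_stochastic_binary_perceptron(X, y)
w = binarize(theta)
err = training_error(w, X, y)
print(f"training error of the binarized weights: {err:.3f}")
```

The optimization acts only on the real-valued `theta`, and a binary configuration is read out at the end by taking signs. As `|theta_i|` grows, the distribution over each synapse concentrates on a single value, so the mean weights and the binarized weights move closer together.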