Rounding Methods for Neural Networks with Low Resolution Synaptic Weights
This work addresses the challenge of implementing neural networks on dedicated hardware with low-resolution weights. It is incremental in that it builds on existing rounding techniques.
The paper tackles the problem of performance loss when reducing neural network weight resolution for hardware implementation, proposing two methods that substantially outperform standard rounding and demonstrating their applicability to three common algorithms under fixed memory constraints.
Neural network algorithms simulated on standard computing platforms typically use high-resolution weights in floating-point notation. For dedicated hardware implementations of such algorithms, however, fixed-point synaptic weights with low resolution are preferable. The basic approach of reducing the weight resolution in these algorithms by standard rounding incurs drastic losses in performance. To reduce the resolution further, in the extreme case even to binary weights, more advanced techniques are necessary. To this end, we propose two methods for mapping neural network algorithms with high-resolution weights to corresponding algorithms that work with low-resolution weights, and we demonstrate that their performance is substantially better than that of standard rounding. We further use these methods to investigate the performance of three common neural network algorithms when the memory size of the weight matrix is held fixed and the weight resolution is varied. We show that dedicated hardware systems (be they electronic or biological) whose technology dictates very low weight resolutions could in principle implement the algorithms we study.
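For orientation, the standard-rounding baseline and the fixed-memory trade-off mentioned above can be sketched as follows. This is only an illustrative sketch, not one of the two proposed methods. The function names `round_weights` and `weights_in_budget`, the symmetric range [-w_max, w_max], and the evenly spaced levels are assumptions made for this example.

```python
def round_weights(weights, bits, w_max=1.0):
    """Round each weight to the nearest of 2**bits evenly spaced levels.

    Assumption for this sketch: the levels span [-w_max, w_max] symmetrically,
    so bits=1 gives binary weights {-w_max, +w_max}.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    levels = 2 ** bits
    step = 2.0 * w_max / (levels - 1)  # spacing between adjacent levels
    rounded = []
    for w in weights:
        clipped = max(-w_max, min(w_max, w))      # saturate out-of-range weights
        index = round((clipped + w_max) / step)   # index of the nearest level
        rounded.append(-w_max + index * step)
    return rounded


def weights_in_budget(memory_bits, bits):
    """Number of weights that fit in a fixed memory budget at a given resolution."""
    return memory_bits // bits


if __name__ == "__main__":
    high_res = [0.3, -0.7, 0.05, 0.9]
    for bits in (1, 2, 4):
        print(bits, round_weights(high_res, bits), weights_in_budget(1024, bits))
```

Under a fixed budget, halving the number of bits per weight doubles the number of weights that can be stored, which is the trade-off the fixed-memory comparison explores.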