Bit-wise Training of Neural Network Weights
This method is relevant to machine learning researchers and practitioners interested in efficient neural network training and storage, since it may reduce computational and memory overhead, although its application to existing architectures appears incremental.
The paper trains neural networks by learning the individual bits of each weight, which yields integer-valued weights at arbitrary bit-depths and induces sparsity without extra constraints. It reports better results than standard training for fully connected networks and similar performance for convolutional and residual networks. It also finds that more than 90% of a network can be used to store arbitrary codes without affecting accuracy.
We introduce an algorithm in which the individual bits representing the weights of a neural network are learned. This method allows training weights with integer values at arbitrary bit-depths and naturally uncovers sparse networks, without additional constraints or regularization techniques. We show better results than the standard training technique for fully connected networks, and performance similar to standard training for convolutional and residual networks. By training bits selectively, we found that the three most significant bits contribute the most to achieving high accuracy, while the remaining bits provide an intrinsic regularization. As a consequence, more than 90% of a network can be used to store arbitrary codes without affecting its accuracy. These codes may be random noise, binary files, or even the weights of previously trained networks.
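The storage claim can be illustrated with a small sketch. It is not the authors' implementation: it assumes unsigned weights, a bit-depth of 32, and hypothetical helper names (`bits_to_weight`, `weight_to_bits`, `embed_payload`, `extract_payload`). Under those assumptions, keeping the three most significant bits and overwriting the rest frees 29 of 32 bits per weight, about 90.6%, which is in line with the "more than 90%" figure.

```python
# Illustrative sketch only. The bit-depth, the unsigned encoding, and all
# function names are assumptions; the text does not specify them.

BIT_DEPTH = 32   # assumed bit-depth; the method allows arbitrary depths
KEPT_MSBS = 3    # most significant bits that are left untouched


def bits_to_weight(bits):
    """Compose an unsigned integer weight from bits, most significant first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def weight_to_bits(value, depth=BIT_DEPTH):
    """Decompose an unsigned integer weight into bits, most significant first."""
    return [(value >> i) & 1 for i in range(depth - 1, -1, -1)]


def embed_payload(weight, payload, depth=BIT_DEPTH, kept=KEPT_MSBS):
    """Overwrite the low (depth - kept) bits of a weight with a payload."""
    low = depth - kept
    if not 0 <= payload < (1 << low):
        raise ValueError("payload does not fit in the free bits")
    # Clear the low bits, then write the payload into them.
    return ((weight >> low) << low) | payload


def extract_payload(weight, depth=BIT_DEPTH, kept=KEPT_MSBS):
    """Read back the payload stored in the low bits of a weight."""
    low = depth - kept
    return weight & ((1 << low) - 1)


# Usage: a weight whose three most significant bits are 1, 0, 1.
trained = bits_to_weight([1, 0, 1] + [0] * (BIT_DEPTH - KEPT_MSBS))
stored = embed_payload(trained, 0x1234567)
free_fraction = (BIT_DEPTH - KEPT_MSBS) / BIT_DEPTH  # share of bits usable as storage
```

The payload survives a round trip through `extract_payload`, and the top three bits of `stored` match those of `trained`. Whether accuracy is actually preserved depends on the trained network, which this sketch does not model.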