RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks
This work addresses the challenge of deploying neural networks efficiently in resource-constrained environments, though the contribution is incremental, as it builds on existing quantization techniques.
The paper tackles the problem of quantizing neural network weights to binary and ternary values by introducing Random Partition Relaxation (RPR), a method that achieves state-of-the-art accuracy for GoogLeNet and competitive performance for ResNet-18 and ResNet-50.
We present Random Partition Relaxation (RPR), a method for strong quantization of neural network weights to binary (+1/-1) and ternary (+1/0/-1) values. Starting from a pre-trained model, we quantize the weights and then relax random partitions of them to their continuous values for retraining, before re-quantizing them and switching to another weight partition for further adaptation. We demonstrate binary and ternary-weight networks with accuracies beyond the state-of-the-art for GoogLeNet and competitive performance for ResNet-18 and ResNet-50 using an SGD-based training method that can easily be integrated into existing frameworks.
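The quantize, relax, retrain, re-quantize loop described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`quantize_ternary`, `rpr_train`), the fixed ternary threshold, the `relaxed_fraction` parameter, and the choice to keep a latent continuous copy of every weight are all assumptions made for illustration; the abstract specifies none of them.

```python
import numpy as np

def quantize_ternary(w, threshold=0.5):
    """Map weights to {-1, 0, +1}. The fixed threshold is an assumption."""
    return np.where(np.abs(w) > threshold, np.sign(w), 0.0)

def rpr_train(w, grad_fn, quantize, relaxed_fraction=0.2,
              rounds=5, steps=10, lr=0.1, seed=0):
    """Hypothetical sketch of random partition relaxation with plain SGD.

    w        : pre-trained continuous weights
    grad_fn  : returns the loss gradient for a given weight array
    quantize : maps continuous weights to their quantized values
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float).copy()  # latent continuous weights (assumption)
    for _ in range(rounds):
        # Draw a new random partition: True marks the relaxed (trainable) weights.
        relaxed = rng.random(w.shape) < relaxed_fraction
        # Relaxed weights take their continuous values; the rest stay quantized.
        w_eff = np.where(relaxed, w, quantize(w))
        for _ in range(steps):
            # SGD step; the mask keeps the quantized partition frozen.
            w_eff = w_eff - lr * grad_fn(w_eff) * relaxed
        # Store the retrained continuous values before switching partitions.
        w = np.where(relaxed, w_eff, w)
    # Final re-quantization of all weights.
    return quantize(w)

# Toy usage: the loss 0.5 * ||w - target||^2 has gradient (w - target).
target = np.array([0.9, -0.8, 0.1, -0.05])
w_q = rpr_train(np.zeros(4), lambda w: w - target, quantize_ternary)
```

Multiplying the gradient by the partition mask is what makes this fit an ordinary SGD loop: only the relaxed weights move, while the frozen ones contribute their quantized values to the loss. A binary variant would swap in a quantizer that maps to +1/-1.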