Proximal Mean-field for Neural Network Quantization
This work addresses the problem of compressing neural networks to reduce memory and time complexity, which is crucial for deployment in resource-constrained environments. The contribution may appear incremental, however, since it applies existing MRF optimization techniques to quantization.
The paper tackles neural network quantization by framing it as a discrete labeling problem and proposes an efficient iterative optimization method based on projected gradient descent, which is shown to be equivalent to a proximal mean-field approach. Experiments on standard datasets like MNIST, CIFAR10/100, and TinyImageNet demonstrate that the algorithm achieves fully-quantized networks with accuracies very close to floating-point reference networks.
Compressing large Neural Networks (NN) by quantizing the parameters while maintaining performance is highly desirable because it reduces memory and time complexity. In this work, we cast NN quantization as a discrete labelling problem, and by examining relaxations, we design an efficient iterative optimization procedure that involves stochastic gradient descent followed by a projection. We prove that our simple projected gradient descent approach is, in fact, equivalent to a proximal version of the well-known mean-field method. These findings would allow the decades-old and theoretically grounded research on MRF optimization to be used to design better network quantization schemes. Our experiments on standard classification datasets (MNIST, CIFAR10/100, TinyImageNet) with convolutional and residual architectures show that our algorithm obtains fully-quantized networks with accuracies very close to the floating-point reference networks.
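To make the "gradient step followed by a projection" idea concrete, the following is a minimal sketch on a toy least-squares problem, not the paper's implementation. It assumes that each weight carries a probability vector over the quantization labels, that the projection is a softmax with a growing temperature `beta`, and that the final labels are read off with an argmax. None of these details is stated in the abstract, and the names `quantize_pgd`, `beta_growth`, and `u_tilde` are hypothetical.

```python
import numpy as np

def softmax(z, beta=1.0):
    """Row-wise softmax with temperature beta; maps each row onto the simplex."""
    z = beta * z
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def quantize_pgd(X, y, labels, steps=200, lr=0.1, beta_growth=1.05):
    """Toy quantization of linear-regression weights (illustrative assumptions only).

    Each weight j has a relaxed labelling u[j] on the probability simplex;
    the continuous weight is the expectation w[j] = sum_l u[j, l] * labels[l].
    """
    labels = np.asarray(labels, dtype=float)
    m, d = X.shape[1], len(labels)
    u_tilde = np.zeros((m, d))  # unconstrained auxiliary variables (assumed)
    beta = 1.0
    for _ in range(steps):
        u = softmax(u_tilde, beta)           # projection onto the simplex
        w = u @ labels                       # relaxed weights
        grad_w = X.T @ (X @ w - y) / len(y)  # gradient of 0.5 * mean squared error
        grad_u = np.outer(grad_w, labels)    # chain rule: dL/du[j, l] = dL/dw[j] * labels[l]
        u_tilde -= lr * grad_u               # gradient step on the auxiliary variables
        beta *= beta_growth                  # sharpen the softmax toward one-hot rows
    u = softmax(u_tilde, beta)
    return labels[u.argmax(axis=1)], u       # hard labels and final relaxed labelling
```

Calling `quantize_pgd(X, y, [-1.0, 1.0])` returns one label per weight together with the final relaxed labelling. The growing temperature is one simple way to push the relaxed solution toward a discrete one; a real network would replace the closed-form gradient with backpropagated stochastic gradients.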