LG Nov 9, 2023

RepQ: Generalizing Quantization-Aware Training for Re-Parametrized Architectures

Anastasiia Prutianova, Alexey Zaytsev, Chung-Kuei Lee, Fengyu Sun, Ivan Koryakovskiy

arXiv:2311.05317v12.0 h-index: 6

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of deploying memory-hungry, computationally intensive neural networks in resource-constrained environments. It is an incremental advance that combines two existing techniques, quantization and re-parametrization.

The paper tackles the problem of applying quantization to re-parametrized neural networks to improve efficiency, proposing RepQ, which outperforms the LSQ baseline in all experiments.

Existing neural networks are memory-consuming and computationally intensive, which makes deploying them in resource-constrained environments challenging. However, there are various methods to improve their efficiency. Two such methods are quantization, a well-known approach for network compression, and re-parametrization, an emerging technique designed to improve model performance. Although both techniques have been studied individually, there has been limited research on their simultaneous application. To address this gap, we propose a novel approach called RepQ, which applies quantization to re-parametrized networks. Our method is based on the insight that the test-stage weights of an arbitrary re-parametrized layer can be expressed as a differentiable function of trainable parameters. We enable quantization-aware training by applying quantization on top of this function. RepQ generalizes well to various re-parametrized models and outperforms the baseline LSQ quantization scheme in all experiments.
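The core idea in the abstract, quantizing the merged test-stage weights rather than each training-time branch, can be sketched in a few lines. This is a minimal forward-only illustration and not the authors' implementation. It assumes a single-channel block with a parallel 3x3 and 1x1 branch, a uniform symmetric quantizer with a fixed step size, and hypothetical function names (`merge_branches`, `fake_quantize`, `quantized_merged_kernel`). In actual quantization-aware training the rounding step would also need a gradient approximation such as a straight-through estimator, which is omitted here.

```python
def merge_branches(k3, k1):
    """Fold a parallel 1x1 branch into a 3x3 kernel (single channel).

    The merged kernel is a differentiable (here: linear) function of both
    trainable kernels, which is the property the abstract relies on.
    Assumption: one input and one output channel, no batch norm.
    """
    merged = [row[:] for row in k3]  # copy so the trainable kernel is untouched
    merged[1][1] += k1               # a 1x1 kernel only touches the centre tap
    return merged


def fake_quantize(w, step, bits=8):
    """Uniform symmetric fake quantization: snap to the grid, clip, rescale."""
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = max(q_min, min(q_max, round(w / step)))
    return q * step


def quantized_merged_kernel(k3, k1, step, bits=8):
    """Apply quantization on top of the merged test-stage kernel."""
    return [[fake_quantize(w, step, bits) for w in row]
            for row in merge_branches(k3, k1)]


if __name__ == "__main__":
    k3 = [[0.1, 0.2, 0.1],
          [0.0, 0.5, 0.0],
          [0.1, 0.2, 0.1]]
    for row in quantized_merged_kernel(k3, k1=0.25, step=0.25, bits=4):
        print(row)
```

Because the quantizer sees the merged kernel, the weights simulated during training are the same ones that would be deployed after the branches are folded together, which is the mismatch that quantizing each branch separately would not address.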
