HPTQ: Hardware-Friendly Post Training Quantization
This work addresses the need for efficient model deployment on edge devices. Its contribution is incremental, however, because it combines existing quantization methods rather than proposing a new one.
The paper tackles neural network quantization for edge devices by introducing HPTQ, a hardware-friendly post-training quantization framework whose quantizers are uniform, symmetric, and use power-of-two thresholds. It reports competitive results on classification, object detection, semantic segmentation, and pose estimation.
Neural network quantization enables the deployment of models on edge devices. An essential requirement for their hardware efficiency is that the quantizers are hardware-friendly: uniform, symmetric, and with power-of-two thresholds. To the best of our knowledge, current post-training quantization methods do not support all of these constraints simultaneously. In this work, we introduce a hardware-friendly post training quantization (HPTQ) framework, which addresses this problem by synergistically combining several known quantization methods. We perform a large-scale study on four tasks: classification, object detection, semantic segmentation and pose estimation over a wide variety of network architectures. Our extensive experiments show that competitive results can be obtained under hardware-friendly constraints.
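To make the three hardware-friendly constraints concrete, the following is a minimal sketch of a quantizer that is uniform (a single step size), symmetric (range centered on zero), and uses a power-of-two threshold. It is an illustration only, not the HPTQ procedure: the function names, the 8-bit default, and the no-clipping rule for choosing the threshold are all assumptions made for this example.

```python
import numpy as np


def power_of_two_threshold(x):
    """Return the smallest power of two that is >= max |x|.

    Hypothetical no-clipping rule, chosen for illustration; the framework
    described above may select thresholds differently.
    """
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return 1.0  # arbitrary fallback for an all-zero tensor
    return float(2.0 ** np.ceil(np.log2(max_abs)))


def quantize_symmetric(x, threshold, n_bits=8):
    """Uniform, symmetric, signed fake-quantization of x to n_bits."""
    levels = 2 ** (n_bits - 1)
    # One step size for the whole range; with a power-of-two threshold
    # the step is itself a power of two.
    step = threshold / levels
    # Round to the nearest integer level, then clip to the signed range.
    q = np.clip(np.round(x / step), -levels, levels - 1)
    # Map the integer levels back to real values.
    return q * step


x = np.array([-0.9, 0.1, 0.5, 0.74])
t = power_of_two_threshold(x)
x_q = quantize_symmetric(x, t)
```

Because the threshold is a power of two, the step size is one as well, so rescaling between the integer and real domains can be implemented as a bit shift instead of a multiplication. That is the usual argument for why this constraint helps on hardware.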