Fully Quantized Image Super-Resolution Networks
This work addresses the challenge of developing accurate, real-time, and energy-efficient image Super-Resolution inference methods for intelligent mobile devices by improving model quantization techniques.
This paper proposes a Fully Quantized image Super-Resolution (FQSR) framework that enables end-to-end quantization of all layers, including skip connections, in SR networks. The FQSR framework achieves performance on par with full-precision models on five benchmark datasets while significantly reducing computational cost and memory consumption compared to state-of-the-art quantized SR methods.
With the rising popularity of intelligent mobile devices, it is of great practical significance to develop accurate, real-time, and energy-efficient image Super-Resolution (SR) inference methods. A prevailing method for improving inference efficiency is model quantization, which replaces expensive floating-point operations with efficient fixed-point or bitwise arithmetic. To date, it remains challenging for quantized SR frameworks to deliver a feasible accuracy-efficiency trade-off. Here, we propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy. In particular, we target end-to-end quantized models in which all layers are quantized, including skip connections, which have rarely been addressed in the literature. We further identify two training obstacles faced by low-bit SR networks and propose two novel methods accordingly. The two difficulties are caused by 1) activation and weight distributions differing vastly across layers; and 2) the inaccurate approximation of the quantization. We apply our quantization scheme to multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR. Experimental results show that FQSR with low-bit quantization achieves performance on par with its full-precision counterparts on five benchmark datasets and surpasses state-of-the-art quantized SR methods with significantly reduced computational cost and memory consumption.
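To make the idea of replacing floating-point operations with fixed-point arithmetic concrete, the following is a minimal sketch of generic uniform symmetric quantization in NumPy. It is not the FQSR method itself. The function names (`quantize`, `dequantize`, `quantized_skip_add`), the max-based choice of scale, and the shared-scale treatment of the skip connection are all assumptions made for illustration only.

```python
import numpy as np


def quantize(x, num_bits=4):
    """Map a float tensor to signed integer codes plus one float scale.

    Assumption: the scale is derived from the tensor's own maximum magnitude,
    so each layer gets its own scale and tolerates a different value range.
    """
    qmax = 2 ** (num_bits - 1) - 1   # e.g. 7 for 4 bits
    qmin = -qmax - 1                 # e.g. -8 for 4 bits
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.clip(np.round(x / scale), qmin, qmax).astype(np.int32)
    return codes, scale


def dequantize(codes, scale):
    """Recover an approximate float tensor from integer codes."""
    return codes.astype(np.float32) * scale


def quantized_skip_add(x, residual, num_bits=4):
    """Add a skip connection on integer codes (hypothetical helper).

    Assumption: both branches share one scale, so the element-wise addition
    needs no floating-point work until the final rescaling.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(float(np.max(np.abs(x))), float(np.max(np.abs(residual))))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    qx = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    qr = np.clip(np.round(residual / scale), -qmax - 1, qmax).astype(np.int32)
    return qx + qr, scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 4)).astype(np.float32)
    skip = rng.normal(size=(4, 4)).astype(np.float32)

    codes, scale = quantize(features, num_bits=4)
    print("max rounding error:", np.max(np.abs(dequantize(codes, scale) - features)))

    summed, shared_scale = quantized_skip_add(features, skip, num_bits=4)
    print("max skip-add error:", np.max(np.abs(summed * shared_scale - (features + skip))))
```

With this scheme the rounding error per element is at most half the scale, which illustrates why a poorly chosen scale for a layer with an unusual value distribution can hurt low-bit accuracy. Leaving the skip branch in floating point would keep part of the network outside the integer pipeline, which is the gap an end-to-end quantized design aims to close.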