MWQ: Multiscale Wavelet Quantized Neural Networks
This paper addresses model deployment on resource-constrained hardware by improving quantization techniques, though the contribution appears incremental because it builds on existing frequency-domain insights.
The paper tackles performance degradation in quantized neural networks by proposing a multiscale wavelet quantization method that decomposes data into frequency components to reduce information loss. It demonstrates applications such as model compression on the ImageNet and COCO datasets and reports improved representation ability.
Model quantization reduces model size and computational latency, and it has become an essential technique for deploying deep neural networks on resource-constrained hardware (e.g., mobile phones and embedded devices). Existing quantization methods mainly consider the numerical values of individual weights and activations and ignore the relationships between elements. The decline in representation ability and the loss of information usually lead to performance degradation. Inspired by the characteristics of images in the frequency domain, we propose a novel multiscale wavelet quantization (MWQ) method. This method decomposes the original data into multiscale frequency components by a wavelet transform and then quantizes the components of each scale separately. It exploits multiscale frequency and spatial information to alleviate the information loss caused by quantization in the spatial domain. Because MWQ is flexible, we demonstrate three applications (model compression, quantized network optimization, and information enhancement) on the ImageNet and COCO datasets. Experimental results show that our method has stronger representation ability and can play an effective role in quantized neural networks.
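The decompose-then-quantize idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the single decomposition level, the symmetric uniform quantizer, and the bit widths (`bits_low`, `bits_high`) are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D orthonormal Haar transform (height and width must be even)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a - b + c - d) / 2.0  # differences between adjacent columns
    hl = (a + b - c - d) / 2.0  # differences between adjacent rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-resolution array from four subbands."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def quantize_uniform(x, bits):
    """Symmetric uniform fake-quantization: snap to a signed integer grid, then rescale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()
    scale = max_abs / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale


def wavelet_quantize(x, bits_low=8, bits_high=4):
    """Quantize each Haar subband separately, then return to the spatial domain.

    The bit allocation (more bits for the low-frequency band) is an assumption
    made for this sketch, not a setting taken from the paper.
    """
    ll, lh, hl, hh = haar_dwt2(x)
    ll_q = quantize_uniform(ll, bits_low)
    details_q = [quantize_uniform(s, bits_high) for s in (lh, hl, hh)]
    return haar_idwt2(ll_q, *details_q)
```

Calling `wavelet_quantize(w)` on a 2D float array `w` with even dimensions returns an array of the same shape whose subbands have each been quantized with their own scale. A multiscale variant could apply `haar_dwt2` again to the `ll` band before quantizing, which this sketch omits.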