CV LG ML · May 7, 2019

High Frequency Residual Learning for Multi-Scale Image Classification

Bowen Cheng, Rong Xiao, Jianfeng Wang, Thomas Huang, Lei Zhang

arXiv:1905.02649v14.123 citations

Originality: Incremental advance

AI Analysis

This work addresses efficiency and accuracy trade-offs in image classification for resource-constrained devices, presenting an incremental improvement over existing architectures.

The paper tackles efficient multi-scale image classification for mobile and embedded vision by proposing a high frequency residual learning framework. On ImageNet-1k, it reports accuracy gains of 1.5% over both ResNet-18 and MobileNet (alpha=1.0) and 3.8% over the more efficient MobileNet (alpha=0.25), without increasing computation.

We present a novel high frequency residual learning framework, which leads to a highly efficient multi-scale network (MSNet) architecture for mobile and embedded vision problems. The architecture utilizes two networks: a low resolution network to efficiently approximate low frequency components and a high resolution network to learn high frequency residuals by reusing the upsampled low resolution features. With a classifier calibration module, MSNet can dynamically allocate computation resources during inference to achieve a better speed and accuracy trade-off. We evaluate our methods on the challenging ImageNet-1k dataset and observe consistent improvements over different base networks. On ResNet-18 and MobileNet with alpha=1.0, MSNet gains 1.5% accuracy over both architectures without increasing computations. On the more efficient MobileNet with alpha=0.25, our method gains 3.8% accuracy with the same amount of computations.
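The two ideas in the abstract, splitting a signal into a low frequency approximation plus a high frequency residual and running the expensive network only when needed, can be illustrated with a minimal sketch. This is not the authors' implementation: the decomposition below operates on raw pixels rather than on network features, and every name in it (`downsample2x`, `upsample2x`, `high_frequency_residual`, `predict`, `low_net`, `high_net`, `threshold`) is hypothetical. The maximum softmax probability is used as a simple stand-in for the classifier calibration module, whose details are not given here.

```python
import math


def downsample2x(img):
    """Average-pool a 2D list by a factor of 2 (assumes even height and width)."""
    h, w = len(img), len(img[0])
    return [
        [(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


def upsample2x(img):
    """Nearest-neighbour upsampling by a factor of 2."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def high_frequency_residual(img):
    """Return (low, residual) such that low + residual == img element-wise."""
    low = upsample2x(downsample2x(img))
    residual = [[a - b for a, b in zip(r_img, r_low)]
                for r_img, r_low in zip(img, low)]
    return low, residual


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(image, low_net, high_net, threshold=0.9):
    """Confidence-gated inference (assumed scheme, not the paper's exact module).

    low_net sees the downsampled image; high_net is called only when the
    low resolution prediction is not confident enough, and receives the
    low resolution logits so it can reuse that work.
    """
    low_logits = low_net(downsample2x(image))
    probs = softmax(low_logits)
    if max(probs) >= threshold:
        return probs.index(max(probs)), "low"
    high_logits = high_net(image, low_logits)
    probs = softmax(high_logits)
    return probs.index(max(probs)), "high"
```

In this sketch, raising `threshold` sends more inputs to the high resolution path, which trades speed for accuracy; lowering it does the reverse.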
