CV | Jul 19, 2020

Resolution Switchable Networks for Runtime Efficient Image Recognition

arXiv:2007.09558v3 | 32 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for flexible computational resource usage in image recognition. It is an incremental advance, since it builds on existing techniques such as batch normalization and knowledge distillation.

The authors enable a single convolutional neural network to switch input image resolutions at inference for runtime efficiency. On the ImageNet dataset, the resulting models improve accuracy across a wide range of resolutions compared with individually trained models.

We propose a general method to train a single convolutional neural network which is capable of switching image resolutions at inference. Thus the running speed can be selected to meet various computational resource limits. Networks trained with the proposed method are named Resolution Switchable Networks (RS-Nets). The basic training framework shares network parameters for handling images which differ in resolution, yet keeps separate batch normalization layers. Though it is parameter-efficient in design, it leads to inconsistent accuracy variations at different resolutions, for which we provide a detailed analysis from the aspect of the train-test recognition discrepancy. A multi-resolution ensemble distillation is further designed, where a teacher is learnt on the fly as a weighted ensemble over resolutions. Thanks to the ensemble and knowledge distillation, RS-Nets enjoy accuracy improvements at a wide range of resolutions compared with individually trained models. Extensive experiments on the ImageNet dataset are provided, and we additionally consider quantization problems. Code and models are available at https://github.com/yikaiw/RS-Nets.
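The two ideas in the abstract, batch normalization layers kept separate per resolution and a teacher formed as a weighted ensemble over resolutions, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The resolution set, the class `SwitchableBatchNorm`, the function `ensemble_distillation_loss`, and the exact form of the loss are all assumptions made for illustration.

```python
import numpy as np

# Assumed set of input sizes; the abstract does not list them.
RESOLUTIONS = (224, 192, 160, 128, 96)


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


class SwitchableBatchNorm:
    """Hypothetical helper: one set of scale/shift parameters per resolution.

    Every other layer of the network would share its weights across resolutions.
    """

    def __init__(self, num_features, resolutions=RESOLUTIONS):
        self.gamma = {r: np.ones(num_features) for r in resolutions}
        self.beta = {r: np.zeros(num_features) for r in resolutions}

    def __call__(self, x, resolution, eps=1e-5):
        # x has shape (batch, features); normalize with batch statistics.
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + eps)
        # Select the parameters that belong to the active resolution.
        return self.gamma[resolution] * x_hat + self.beta[resolution]


def ensemble_distillation_loss(logits_per_res, labels, alpha):
    """Assumed loss: hard-label loss on the ensemble plus soft-label loss per student.

    logits_per_res: array of shape (S, batch, classes), one slice per resolution.
    labels: integer class labels of shape (batch,).
    alpha: unnormalized ensemble weights of shape (S,), learned in real training.
    """
    weights = softmax(alpha)  # ensemble weights sum to 1
    teacher_logits = np.tensordot(weights, logits_per_res, axes=1)  # (batch, classes)
    teacher_probs = softmax(teacher_logits)

    rows = np.arange(labels.shape[0])
    ce_teacher = -np.log(teacher_probs[rows, labels] + 1e-12).mean()

    # Each per-resolution prediction is a student of the ensemble teacher.
    # In real training the teacher would be held constant for this term.
    student_log_probs = np.log(softmax(logits_per_res) + 1e-12)
    kd = -(teacher_probs[None] * student_log_probs).sum(axis=-1).mean()
    return ce_teacher + kd
```

At inference, switching resolution under this sketch amounts to resizing the input and passing the matching `resolution` key, so only the normalization parameters change while all other weights stay the same.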

Code Implementations: 1 repo
