CV | Jun 7, 2022

Localizing Semantic Patches for Accelerating Image Classification

arXiv:2206.03367v1 | 2 citations | h-index: 26 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficiency in image classification for computer vision applications, offering an incremental improvement over existing dynamic inference methods.

The paper tackles the problem of spatial redundancy in image classification by localizing task-aware regions with a lightweight patch proposal network, achieving state-of-the-art performance on ImageNet with reduced inference costs.

Existing works often focus on reducing architectural redundancy to accelerate image classification but ignore the spatial redundancy of the input image. This paper proposes an efficient image classification pipeline to address this problem. We first pinpoint task-aware regions of the input image with a lightweight patch proposal network called AnchorNet. We then feed these localized semantic patches, which carry much less spatial redundancy, into a general classification network. Unlike the popular design of deep CNNs, we carefully design the receptive field of AnchorNet without intermediate convolutional padding. This ensures an exact mapping from a high-level spatial location to the specific input image patch, so the contribution of each patch is interpretable. Moreover, AnchorNet is compatible with any downstream architecture. Experimental results on ImageNet show that our method outperforms SOTA dynamic inference methods at lower inference cost. Our code is available at https://github.com/winycg/AnchorNet.
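The padding-free mapping described in the abstract can be illustrated with standard receptive-field arithmetic. The sketch below is not AnchorNet's actual architecture: the layer stack `LAYERS` and the helper names `receptive_field` and `patch_for_location` are hypothetical, chosen only to show why the absence of padding lets a feature-map location be traced back to one exact input patch.

```python
from typing import List, Tuple

# Each layer is (kernel_size, stride); padding is assumed to be zero everywhere.
Layer = Tuple[int, int]


def receptive_field(layers: List[Layer]) -> Tuple[int, int]:
    """Return (receptive field size, cumulative stride) of an unpadded stack."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input steps
        jump *= stride             # distance in input pixels between adjacent outputs
    return rf, jump


def patch_for_location(i: int, j: int, layers: List[Layer]) -> Tuple[int, int, int, int]:
    """Map feature-map location (i, j) to its input patch (top, left, bottom, right).

    Bottom and right are exclusive. With zero padding, the patch for output
    index i starts exactly at input index i * cumulative_stride.
    """
    rf, jump = receptive_field(layers)
    top, left = i * jump, j * jump
    return top, left, top + rf, left + rf


# Hypothetical stack for illustration only (not taken from the paper).
LAYERS: List[Layer] = [(3, 2), (3, 2), (3, 2), (3, 1)]

if __name__ == "__main__":
    print(receptive_field(LAYERS))           # (31, 8)
    print(patch_for_location(2, 3, LAYERS))  # (16, 24, 47, 55)
```

With padding, the patch origin would shift by a layer-dependent offset and patches near the border would partly cover padded zeros, so the correspondence between a feature-map location and an input crop would no longer be exact.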

Code Implementations: 1 repo
