CV · AI · LG · Jul 23, 2022

SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling

arXiv:2207.11511v1 · 2 citations · h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses efficiency in visual recognition for applications like image classification and object detection, but it is incremental as it builds on existing networks like ResNet.

The paper tackles the inefficiency of visual recognition networks by proposing SSBNet, which inserts adaptive sampling layers into the building blocks of existing networks. SSB-ResNet-RS-200 reaches 82.6% accuracy on ImageNet, 0.6% higher than the ResNet-RS-152 baseline of similar complexity.

Downsampling is widely adopted to achieve a good trade-off between accuracy and latency for visual recognition. Unfortunately, the commonly used pooling layers are not learned, and thus cannot preserve important information. As another dimension reduction method, adaptive sampling weights and processes regions that are relevant to the task, and is thus able to better preserve useful information. However, the use of adaptive sampling has been limited to certain layers. In this paper, we show that using adaptive sampling in the building blocks of a deep neural network can improve its efficiency. In particular, we propose SSBNet, which is built by inserting sampling layers repeatedly into existing networks like ResNet. Experimental results show that the proposed SSBNet can achieve competitive image classification and object detection performance on the ImageNet and COCO datasets. For example, SSB-ResNet-RS-200 achieved 82.6% accuracy on the ImageNet dataset, which is 0.6% higher than the baseline ResNet-RS-152 with a similar complexity. Visualization shows the advantage of SSBNet in allowing different layers to focus on different positions, and ablation studies further validate the advantage of adaptive sampling over uniform methods.
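The contrast the abstract draws between fixed pooling and adaptive sampling can be illustrated with a small 1-D sketch. This is not the SSBNet sampling layer: the saliency weights are supplied by hand rather than learned, inverse-CDF placement is used here as a simple stand-in for a learned sampler, and the function names `sampling_positions`, `interpolate`, and `adaptive_downsample` are hypothetical. It only shows how weighting regions by relevance concentrates samples where the weight is high, while uniform weights reduce to ordinary average pooling.

```python
def sampling_positions(saliency, n_out):
    """Place n_out sample points so that high-saliency cells receive more of them.

    Assumes all saliency values are positive. Cell i covers [i, i + 1).
    """
    total = float(sum(saliency))
    cdf, acc = [], 0.0
    for s in saliency:
        acc += s / total
        cdf.append(acc)  # cumulative weight at the right edge of each cell

    positions = []
    for k in range(n_out):
        target = (k + 0.5) / n_out  # evenly spaced quantiles of the saliency mass
        i = 0
        while i < len(cdf) - 1 and cdf[i] < target:
            i += 1
        lo = cdf[i - 1] if i > 0 else 0.0
        frac = (target - lo) / (cdf[i] - lo)
        positions.append(i + frac)
    return positions


def interpolate(signal, pos):
    """Linearly interpolate signal at pos, with signal[i] located at cell centre i + 0.5."""
    x = min(max(pos - 0.5, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    t = x - i
    return (1.0 - t) * signal[i] + t * signal[i + 1]


def adaptive_downsample(signal, saliency, n_out):
    """Downsample signal (length >= 2) to n_out values at saliency-driven positions."""
    return [interpolate(signal, p) for p in sampling_positions(saliency, n_out)]


if __name__ == "__main__":
    signal = [0.0, 2.0, 4.0, 6.0]
    # Uniform saliency behaves like 2x average pooling.
    print(adaptive_downsample(signal, [1, 1, 1, 1], 2))
    # Peaked saliency pulls half of the samples into the third cell.
    print(sampling_positions([1, 1, 8, 1, 1], 4))
```

In a network, the saliency would be predicted from the features themselves and the sampling would operate on 2-D feature maps, so this sketch conveys only the sampling idea, not the architecture.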
