CV · Apr 3, 2018

Multi-Scale Spatially-Asymmetric Recalibration for Image Classification

arXiv:1804.00787v1 · 15 citations
Originality: Incremental advance
AI Analysis

This paper addresses a specific problem in computer vision, image classification, and offers an incremental improvement by making better use of contextual features within existing architectures.

The paper tackles the limitation of spatially-symmetric convolution in image classification by introducing multi-scale spatially-asymmetric recalibration (MS-SAR), which improves performance on the CIFAR and ILSVRC2012 tasks while requiring only small fractions of extra parameters and computation.

Convolution is spatially-symmetric, i.e., the visual features are independent of their position in the image, which limits its ability to utilize contextual cues for visual recognition. This paper addresses this issue by introducing a recalibration process, which refers to the surrounding region of each neuron, computes an importance value, and multiplies the original neural response by it. Our approach is named multi-scale spatially-asymmetric recalibration (MS-SAR), which extracts visual cues from surrounding regions at multiple scales, and designs a weighting scheme which is asymmetric in the spatial domain. MS-SAR is implemented in an efficient way, so that only small fractions of extra parameters and computations are required. We apply MS-SAR to several popular building blocks, including the residual block and the densely-connected block, and demonstrate its superior performance in both CIFAR and ILSVRC2012 classification tasks.
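
The recalibration idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it works on a single-channel feature map stored as nested lists, and the function names (`local_mean`, `recalibrate`), the window radii, the fixed per-scale weights, and the sigmoid squashing are all assumptions chosen for illustration (in a trained network the weights would be learned). For each position it summarizes the surrounding region at several scales, turns that summary into an importance value between 0 and 1, and multiplies the original response by it.

```python
import math


def local_mean(fmap, r, c, radius):
    """Mean response in the (2*radius+1)^2 window around (r, c), clipped at borders."""
    rows, cols = len(fmap), len(fmap[0])
    vals = [
        fmap[i][j]
        for i in range(max(0, r - radius), min(rows, r + radius + 1))
        for j in range(max(0, c - radius), min(cols, c + radius + 1))
    ]
    return sum(vals) / len(vals)


def recalibrate(fmap, radii=(1, 2), scale_weights=(1.0, 0.5), bias=0.0):
    """Scale each response by an importance value computed from its surroundings.

    radii, scale_weights, and bias are hypothetical fixed values, not learned ones.
    """
    out = []
    for r, row in enumerate(fmap):
        out_row = []
        for c, x in enumerate(row):
            # Context summary: weighted sum of local means at several scales.
            context = sum(
                w * local_mean(fmap, r, c, rad)
                for w, rad in zip(scale_weights, radii)
            )
            # Sigmoid maps the context summary to an importance in (0, 1).
            importance = 1.0 / (1.0 + math.exp(-(context + bias)))
            out_row.append(importance * x)
        out.append(out_row)
    return out


if __name__ == "__main__":
    fmap = [
        [4.0, 4.0, 1.0, 1.0],
        [4.0, 4.0, 1.0, 1.0],
        [1.0, 1.0, 1.0, 1.0],
        [1.0, 1.0, 1.0, 1.0],
    ]
    for row in recalibrate(fmap):
        print([round(v, 3) for v in row])
```

In this sketch, two positions with the same input response can receive different outputs because their neighbourhoods differ, which is the position dependence that a plain convolution with shared weights does not provide on its own.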

