CV · Sep 8, 2019

Squeeze-and-Attention Networks for Semantic Segmentation

arXiv:1909.03402v4 · 281 citations
Originality: Incremental advance
AI Analysis

This work improves semantic segmentation accuracy, an incremental advance with measurable gains on the PASCAL VOC and PASCAL Context benchmarks.

The paper tackles semantic segmentation by proposing a squeeze-and-attention network (SANet) that addresses limitations in existing attention mechanisms, achieving 83.2% mIoU on PASCAL VOC and a state-of-the-art 54.4% mIoU on PASCAL Context.
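For readers unfamiliar with the metric, mean intersection-over-union (mIoU) averages, over classes, the overlap between predicted and ground-truth pixels divided by their union. The snippet below is a minimal sketch on flat label lists. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration; benchmark toolkits typically accumulate a confusion matrix over the whole dataset instead.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length lists of integer labels.

    Hypothetical helper for illustration only. Classes that appear in
    neither `pred` nor `target` are skipped (an assumption).
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Here `score` is the average of 1/2 and 2/3, that is 7/12.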

The recent integration of attention mechanisms into segmentation networks improves their representational capabilities through a great emphasis on more informative features. However, these attention mechanisms ignore an implicit sub-task of semantic segmentation and are constrained by the grid structure of convolution kernels. In this paper, we propose a novel squeeze-and-attention network (SANet) architecture that leverages an effective squeeze-and-attention (SA) module to account for two distinctive characteristics of segmentation: i) pixel-group attention, and ii) pixel-wise prediction. Specifically, the proposed SA modules impose pixel-group attention on conventional convolution by introducing an 'attention' convolutional channel, thus taking into account spatial-channel inter-dependencies in an efficient manner. The final segmentation results are produced by merging outputs from four hierarchical stages of a SANet to integrate multi-scale contexts for obtaining an enhanced pixel-wise prediction. Empirical experiments on two challenging public datasets validate the effectiveness of the proposed SANets, which achieve 83.2% mIoU (without COCO pre-training) on PASCAL VOC and a state-of-the-art mIoU of 54.4% on PASCAL Context.
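The abstract describes the SA module only at a high level, so the following is a rough sketch of the idea rather than the authors' implementation. It pairs a main convolution branch with an 'attention' branch that pools the input to a coarser grid, convolves it, and upsamples it back, so that each group of pixels shares one attention value. All function names, the use of 1x1 (pointwise) convolutions, the pooling window `k`, and the combination rule `attn * main + attn` are assumptions made for illustration. Tensors are plain nested lists indexed as `[channel][row][column]`.

```python
def pointwise_conv(x, weights):
    """1x1 convolution: mixes channels at each pixel. weights is [C_out][C_in]."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wk[c] * x[c][i][j] for c in range(len(x)))
              for j in range(w)] for i in range(h)] for wk in weights]


def avg_pool(x, k):
    """Non-overlapping k x k average pooling (H and W assumed divisible by k)."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(ch[i * k + a][j * k + b] for a in range(k) for b in range(k)) / (k * k)
              for j in range(w // k)] for i in range(h // k)] for ch in x]


def upsample_nearest(x, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return [[[ch[i // k][j // k] for j in range(len(ch[0]) * k)]
             for i in range(len(ch) * k)] for ch in x]


def squeeze_attention(x, w_main, w_attn, k=2):
    """Hypothetical SA-style block: main branch re-weighted by a pooled attention branch."""
    main = pointwise_conv(x, w_main)
    # Attention branch: every k x k group of pixels receives the same value.
    attn = upsample_nearest(pointwise_conv(avg_pool(x, k), w_attn), k)
    # Assumed combination rule: attention scales the main output and is added back.
    return [[[a * m + a for a, m in zip(row_a, row_m)]
             for row_a, row_m in zip(ch_a, ch_m)]
            for ch_a, ch_m in zip(attn, main)]


# One channel, 2 x 2 input, identity weights: the pooled attention value is 2.5.
out = squeeze_attention([[[1.0, 2.0], [3.0, 4.0]]], [[1.0]], [[1.0]], k=2)
```

In a full network, blocks like this would sit at several stages, and the abstract states that outputs from four such stages are merged to form the final prediction; that merging step is not shown here.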

Code Implementations: 4 repos
