CV · May 25, 2018

Pyramid Attention Network for Semantic Segmentation

arXiv:1805.10180v3 · 999 citations
Originality: Incremental advance
AI Analysis

This work targets higher semantic segmentation accuracy and is rated as an incremental advance over existing methods.

The paper tackles semantic segmentation by proposing a Pyramid Attention Network (PAN) that combines attention mechanisms with spatial pyramids to extract precise dense features, achieving state-of-the-art performance with 84.0% mIoU on PASCAL VOC 2012 without using COCO data.
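For readers unfamiliar with the metric quoted above, mean intersection-over-union (mIoU) averages, over classes, the ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch in plain Python, not the benchmark's official evaluation code. The function name `mean_iou`, the flat label lists, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions made for illustration. Real benchmark evaluation also accumulates counts over the whole dataset and handles an "ignore" label, which is omitted here.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two flat lists of integer class labels.

    Hypothetical helper for illustration; not the official PASCAL VOC scorer.
    """
    # confusion[t][p] counts pixels with ground truth t predicted as p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                                  # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp   # wrongly labeled as c
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both inputs
            ious.append(tp / union)

    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
print(f"mIoU = {score:.4f}")
```

A reported figure such as 84.0% corresponds to this quantity computed over the full test set and multiplied by 100.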

A Pyramid Attention Network (PAN) is proposed to exploit the impact of global contextual information in semantic segmentation. Unlike most existing works, we combine an attention mechanism with a spatial pyramid to extract precise dense features for pixel labeling, instead of relying on complicated dilated convolutions and artificially designed decoder networks. Specifically, we introduce a Feature Pyramid Attention module, which applies a spatial pyramid attention structure to the high-level output and combines it with global pooling to learn a better feature representation, and a Global Attention Upsample module on each decoder layer, which provides global context as guidance for low-level features to select category localization details. The proposed approach achieves state-of-the-art performance on the PASCAL VOC 2012 and Cityscapes benchmarks, with a new record of 84.0% mIoU on PASCAL VOC 2012 while training without the COCO dataset.
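The idea behind the Global Attention Upsample module, global context from high-level features guiding low-level features, can be illustrated with a small dependency-free sketch. This is a simplification under stated assumptions, not the authors' implementation: learned convolutions and batch normalization are left out (a bare ReLU stands in for them), both inputs are assumed to have the same number of channels, upsampling is nearest-neighbor, and all function names are hypothetical. Feature maps are nested lists indexed as `[channel][row][column]`.

```python
def global_average_pool(feature_map):
    """Return one mean value per channel of a [C][H][W] nested list."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def upsample_nearest(feature_map, factor):
    """Nearest-neighbor upsampling of a [C][H][W] nested list by an integer factor."""
    return [
        [[value for value in row for _ in range(factor)]
         for row in channel for _ in range(factor)]
        for channel in feature_map
    ]


def global_attention_upsample(low, high, factor):
    """Weight low-level features by pooled high-level context, then add upsampled high-level features.

    Assumption: ReLU replaces the learned transform applied to the pooled
    context in a trained network; there are no trainable parameters here.
    """
    weights = [max(0.0, w) for w in global_average_pool(high)]  # per-channel guidance
    high_up = upsample_nearest(high, factor)                    # match low-level resolution
    return [
        [[weights[c] * low[c][i][j] + high_up[c][i][j]
          for j in range(len(low[c][0]))]
         for i in range(len(low[c]))]
        for c in range(len(low))
    ]


# One channel: a 2x2 low-level map guided by a 1x1 high-level map.
low = [[[1.0, 2.0], [3.0, 4.0]]]
high = [[[2.0]]]
print(global_attention_upsample(low, high, factor=2))
```

In this toy setting each channel of the low-level map is scaled by a single scalar derived from the whole high-level map, which is what lets global context decide how much each low-level channel contributes to the decoder output.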

