CV · Mar 8, 2017

Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network

arXiv:1703.02719v1 · 1605 citations
Originality: Highly original
AI Analysis

This paper addresses the challenge of dense per-pixel prediction in semantic segmentation for computer vision applications, reporting gains of 2.0 points on PASCAL VOC 2012 and 5.1 points on Cityscapes over previous results.

The paper tackles the problem of semantic segmentation by proposing a Global Convolutional Network that uses large kernels to improve classification and localization, achieving state-of-the-art performance with 82.2% on PASCAL VOC 2012 and 76.9% on Cityscapes.

One of the recent trends [30, 31, 14] in network architecture design is stacking small filters (e.g., 1x1 or 3x3) throughout the entire network, because stacked small filters are more efficient than a large kernel given the same computational complexity. However, in the field of semantic segmentation, where we need to perform dense per-pixel prediction, we find that the large kernel (and effective receptive field) plays an important role when we have to perform the classification and localization tasks simultaneously. Following our design principle, we propose a Global Convolutional Network to address both the classification and localization issues for semantic segmentation. We also suggest a residual-based boundary refinement to further refine the object boundaries. Our approach achieves state-of-the-art performance on two public benchmarks and significantly outperforms previous results: 82.2% (vs 80.2%) on the PASCAL VOC 2012 dataset and 76.9% (vs 71.8%) on the Cityscapes dataset.
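The trade-off between stacked small filters and a single large kernel can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the convolutions are assumed to be stride-1 and undilated, and the 1xk-then-kx1 factorization is an assumption used to show how a large kernel can be kept cheap; the abstract does not specify how the kernel is constructed. The channel count of 21 is simply the number of PASCAL VOC classes including background.

```python
def stacked_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Theoretical receptive field of stacked stride-1, undilated convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def dense_kernel_weights(k: int, c_in: int, c_out: int) -> int:
    """Weight count of a single dense k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_pair_weights(k: int, c_in: int, c_out: int) -> int:
    """Weight count of a 1 x k convolution followed by a k x 1 convolution.

    Assumed factorization for illustration; bias terms are ignored.
    """
    return k * c_in * c_out + k * c_out * c_out


if __name__ == "__main__":
    k, channels = 15, 21
    # Layers of 3x3 needed to match the theoretical reach of one k x k kernel.
    layers = (k - 1) // 2
    print("3x3 layers to reach", k, "pixels:", layers)
    print("receptive field:", stacked_receptive_field(3, layers))
    print("dense k x k weights:", dense_kernel_weights(k, channels, channels))
    print("1xk + kx1 weights:", separable_pair_weights(k, channels, channels))
```

These figures describe the theoretical receptive field, which is only an upper bound on the effective receptive field the abstract refers to, so matching theoretical reach with stacked 3x3 layers does not by itself guarantee the same effective coverage.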

Code Implementations: 2 repos
