CV · Mar 8, 2017

Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network

Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun

arXiv:1703.02719v1 · 29.6 · 1605 citations

Originality: Highly original

AI Analysis

This paper addresses dense per-pixel prediction in semantic segmentation, a core computer vision task, and reports a clear gain over previous results on two public benchmarks.

The paper tackles the problem of semantic segmentation by proposing a Global Convolutional Network that uses large kernels to improve classification and localization, achieving state-of-the-art performance with 82.2% on PASCAL VOC 2012 and 76.9% on Cityscapes.

One of the recent trends [30, 31, 14] in network architecture design is stacking small filters (e.g., 1x1 or 3x3) throughout the entire network, because stacked small filters are more efficient than a large kernel given the same computational complexity. However, in the field of semantic segmentation, where we need to perform dense per-pixel prediction, we find that the large kernel (and effective receptive field) plays an important role when we have to perform the classification and localization tasks simultaneously. Following our design principle, we propose a Global Convolutional Network to address both the classification and localization issues for semantic segmentation. We also suggest a residual-based boundary refinement to further refine the object boundaries. Our approach achieves state-of-the-art performance on two public benchmarks and significantly outperforms previous results: 82.2% (vs 80.2%) on the PASCAL VOC 2012 dataset and 76.9% (vs 71.8%) on the Cityscapes dataset.
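The trade-off between stacked small filters and a single large kernel can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only and is not taken from the abstract: the function names are hypothetical, the kernel size k = 15 is an arbitrary example value, and the 1xk plus kx1 factorization is included as one common way to reach a large kernel cheaply, not as a statement of the authors' exact design. It counts weights per input/output channel pair and computes the nominal receptive field, which is only an upper bound on the effective receptive field the abstract refers to.

```python
def stacked_receptive_field(num_layers: int, kernel_size: int = 3) -> int:
    """Nominal receptive field of stacked stride-1, undilated convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def layers_to_match(target_kernel: int, kernel_size: int = 3) -> int:
    """Fewest stacked stride-1 convolutions whose nominal receptive field
    covers a single target_kernel x target_kernel filter."""
    # Ceiling division of (target_kernel - 1) by (kernel_size - 1).
    return -(-(target_kernel - 1) // (kernel_size - 1))


def dense_weights(k: int) -> int:
    """Weights per channel pair for one dense k x k kernel."""
    return k * k


def stacked_weights(num_layers: int, kernel_size: int = 3) -> int:
    """Weights per channel pair for num_layers stacked small kernels."""
    return num_layers * kernel_size * kernel_size


def separable_pair_weights(k: int) -> int:
    """Weights per channel pair for a 1 x k convolution followed by a
    k x 1 convolution (assumed factorization, for illustration)."""
    return 2 * k


if __name__ == "__main__":
    k = 15  # illustrative kernel size, not a value from the abstract
    n = layers_to_match(k)
    print("3x3 layers needed:", n)
    print("nominal receptive field:", stacked_receptive_field(n))
    print("dense k x k weights:", dense_weights(k))
    print("stacked 3x3 weights:", stacked_weights(n))
    print("1xk + kx1 weights:", separable_pair_weights(k))
```

Under these assumptions, seven stacked 3x3 layers match the nominal reach of one 15x15 kernel with far fewer weights, which is the efficiency argument the abstract attributes to the small-filter trend. The abstract's counterpoint is that for dense per-pixel prediction the large kernel and its effective receptive field still matter, which this counting exercise does not capture.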
