Gated Path Selection Network for Semantic Segmentation
This work addresses the problem of capturing diverse semantic contexts in semantic segmentation. It is an incremental improvement: a novel method aimed at a known bottleneck.
The paper tackles scale variations and deformations in semantic segmentation by developing the Gated Path Selection Network (GPSNet), which learns adaptive receptive fields and achieves competitive performance on the Cityscapes and ADE20K datasets.
Semantic segmentation is a challenging task that must handle large scale variations, deformations, and different viewpoints. In this paper, we develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields. In GPSNet, we first design a two-dimensional multi-scale network, SuperNet, which densely incorporates features from growing receptive fields. To dynamically select the desirable semantic context, we further introduce a gate prediction module. In contrast to previous works that focus on optimizing sample positions on regular grids, GPSNet can adaptively capture free-form dense semantic contexts. The derived adaptive receptive fields are data-dependent and flexible enough to model different geometric transformations of objects. On two representative semantic segmentation datasets, Cityscapes and ADE20K, we show that the proposed approach consistently outperforms previous methods and achieves competitive performance without bells and whistles.
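The gating idea can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the actual SuperNet is two-dimensional and densely connected, and the abstract does not specify the form of the gate. Here we assume a handful of parallel dilated-convolution paths with growing receptive fields and a per-position sigmoid gate computed from the input value. The function names (`dilated_conv1d`, `gated_path_selection`) and the parameters `gate_weights` and `gate_biases` are hypothetical and chosen only for illustration.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated cross-correlation with zero padding."""
    n = len(signal)
    center = len(kernel) // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - center) * dilation
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_path_selection(signal, kernel, dilations, gate_weights, gate_biases):
    """Blend dilated paths using per-position, input-dependent gates.

    Assumption: one scalar gate per path and position, computed as
    sigmoid(w * x + b) from the input value x at that position.
    """
    # One path per dilation rate; larger rates give larger receptive fields.
    paths = [dilated_conv1d(signal, kernel, d) for d in dilations]
    output, gates_per_position = [], []
    for i, x in enumerate(signal):
        gates = [sigmoid(w * x + b) for w, b in zip(gate_weights, gate_biases)]
        gates_per_position.append(gates)
        # The gated sum lets each position weight the paths differently.
        output.append(sum(g * p[i] for g, p in zip(gates, paths)))
    return output, gates_per_position


if __name__ == "__main__":
    signal = [0.0, 1.0, 2.0, 3.0, 4.0]
    kernel = [1.0, 1.0, 1.0]
    out, gates = gated_path_selection(
        signal, kernel, dilations=[1, 2],
        gate_weights=[1.0, -1.0], gate_biases=[0.0, 0.0],
    )
    print(out)
    print(gates)
```

Because the gates depend on the input at each position, the effective receptive field differs from position to position, which is the sense in which the abstract calls the receptive fields data-dependent. In a trained network the gate parameters would be learned rather than fixed by hand as they are here.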