CTNet: Context-based Tandem Network for Semantic Segmentation
This work addresses semantic segmentation, a key task in computer vision, and aims to improve accuracy through better contextual modeling. The contribution appears incremental, since it builds on existing contextual approaches.
The paper tackles semantic segmentation by proposing CTNet, which interactively explores spatial and channel contextual information to discover semantic context, achieving superior performance on PASCAL-Context, ADE20K, and PASCAL VOC2012 datasets compared to state-of-the-art methods.
Contextual information has been shown to be powerful for semantic segmentation. This work proposes a novel Context-based Tandem Network (CTNet) that interactively explores spatial contextual information and channel contextual information in order to discover the semantic context for semantic segmentation. Specifically, the Spatial Contextual Module (SCM) uncovers the spatial contextual dependency between pixels by exploring the correlation between pixels and categories. Meanwhile, the Channel Contextual Module (CCM) learns semantic features, including semantic feature maps and class-specific features, by modeling the long-range semantic dependence between channels. The learned semantic features serve as prior knowledge that guides the learning of SCM, allowing SCM to capture more accurate long-range spatial dependency. Finally, to further improve the learned representations, the outputs of the two context modules are adaptively integrated. Extensive experiments are conducted on three widely used datasets: PASCAL-Context, ADE20K, and PASCAL VOC2012. The results demonstrate the superior performance of the proposed CTNet in comparison with several state-of-the-art methods.
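The data flow described above (channel context first, class-specific features as a prior for the spatial module, then adaptive fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the operators, so everything here is an assumption made for illustration: the softmax-normalized affinities, the coarse `class_scores` input, the single scalar `gate_logit` used for fusion, and all function names.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_context(feats):
    """Assumed stand-in for CCM: reweight channels by channel-to-channel affinity.

    feats: (C, N) array with N = H * W pixels.
    """
    affinity = softmax(feats @ feats.T, axis=1)  # (C, C), rows sum to 1
    return affinity @ feats                      # (C, N)


def class_features(feats, class_scores):
    """Assumed class-specific features: one pooled vector per category.

    feats: (C, N); class_scores: (K, N) coarse per-pixel category scores.
    """
    weights = softmax(class_scores, axis=1)      # normalize over pixels, (K, N)
    return weights @ feats.T                     # (K, C)


def spatial_context(feats, centers):
    """Assumed stand-in for SCM: pixel-to-category correlation.

    Each pixel is re-expressed as a mixture of the class-specific features.
    """
    corr = softmax(feats.T @ centers.T, axis=1)  # (N, K), rows sum to 1
    return (corr @ centers).T                    # (C, N)


def tandem_block(feats, class_scores, gate_logit=0.0):
    """Channel context guides spatial context; a scalar gate fuses both.

    feats: (C, H, W); class_scores: (K, H, W); gate_logit: hypothetical
    learnable scalar (fixed here, since this sketch has no training loop).
    """
    c, h, w = feats.shape
    k = class_scores.shape[0]
    flat = feats.reshape(c, h * w)
    chan = channel_context(flat)
    centers = class_features(chan, class_scores.reshape(k, h * w))
    spat = spatial_context(flat, centers)
    gate = 1.0 / (1.0 + np.exp(-gate_logit))     # sigmoid, in (0, 1)
    fused = gate * spat + (1.0 - gate) * chan
    return fused.reshape(c, h, w)


rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4, 5))    # 8 channels on a 4x5 grid
scores = rng.normal(size=(3, 4, 5))   # 3 hypothetical categories
out = tandem_block(feats, scores)
print(out.shape)
```

In a trained network the affinities would be computed from learned projections and the fusion weights would be learned as well. The sketch only makes the tandem ordering concrete: the spatial step consumes class-specific features that were pooled from the channel step's output.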