Attention-guided Unified Network for Panoptic Segmentation
This work addresses panoptic segmentation with a unified network in which foreground cues guide background segmentation, improving accuracy over methods that treat the two tasks separately.
The paper tackles panoptic segmentation by proposing a unified framework that leverages foreground objects to assist background understanding, achieving state-of-the-art results with 46.5% PQ on MS-COCO and 59.0% PQ on Cityscapes.
This paper studies panoptic segmentation, a recently proposed task that segments foreground (FG) objects at the instance level and background (BG) contents at the semantic level. Existing methods mostly deal with these two problems separately. In this paper, we reveal the underlying relationship between them: in particular, FG objects provide complementary cues that assist BG understanding. Our approach, named the Attention-guided Unified Network (AUNet), is a unified framework with two branches that perform FG and BG segmentation simultaneously. Two sources of attention are added to the BG branch: the RPN, which provides object-level attention, and the FG segmentation mask, which provides pixel-level attention. Our approach generalizes to different backbones with consistent accuracy gains in both FG and BG segmentation, and sets a new state of the art on both the MS-COCO (46.5% PQ) and Cityscapes (59.0% PQ) benchmarks.
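The PQ figures quoted above refer to panoptic quality, the standard metric for this task. As a minimal sketch of how it is usually computed for a single class, the snippet below assumes that predicted and ground-truth segments have already been matched (a match requires IoU above 0.5) and that the counts of unmatched predictions and unmatched ground-truth segments are known. The function name `panoptic_quality` and its arguments are illustrative and are not taken from the paper or from any benchmark toolkit.

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Return (PQ, SQ, RQ) for one class.

    matched_ious: IoU of every matched (true positive) segment pair.
    num_fp: number of predicted segments with no match.
    num_fn: number of ground-truth segments with no match.
    """
    for iou in matched_ious:
        # By definition, a match requires IoU strictly above 0.5.
        if not 0.5 < iou <= 1.0:
            raise ValueError("matched IoUs must lie in (0.5, 1.0]")

    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        # No segments of this class in either prediction or ground truth.
        return 0.0, 0.0, 0.0

    sq = sum(matched_ious) / tp if tp else 0.0  # segmentation quality
    rq = tp / denom                             # recognition quality
    pq = sq * rq                                # equals sum(IoU) / denom
    return pq, sq, rq


# Hypothetical example: two matches, one false positive, one false negative.
pq, sq, rq = panoptic_quality([0.9, 0.7], num_fp=1, num_fn=1)
print(round(pq, 3), round(sq, 3), round(rq, 3))
```

A benchmark-level PQ is normally obtained by averaging the per-class values over all classes; this sketch covers only the per-class step.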