CV · Aug 17, 2021

Fully Convolutional Networks for Panoptic Segmentation with Point-based Supervision

Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Yukang Chen, Lu Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia

arXiv:2108.07682v3 · 16.664 citations · Has Code

Originality: Highly original

AI Analysis

This work addresses the high annotation cost and computational inefficiency of panoptic segmentation, offering a method that is both effective and scalable.

The paper tackles panoptic segmentation by introducing Panoptic FCN, a fully convolutional framework that unifies the prediction of foreground things and background stuff: a kernel generator encodes each instance or stuff category as a kernel, and that kernel is convolved with a high-resolution feature map to produce the mask. The approach is efficient and outperforms previous box-based and box-free models. The paper also proposes point-based weak supervision, which cuts the annotation requirement to 20 random points per instance while retaining 82% of fully-supervised performance.
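The generate-kernel-then-segment idea can be illustrated with a minimal sketch. It assumes each generated kernel acts as a 1x1 convolution over the feature map and that a sigmoid turns the response into a mask probability. Both are assumptions made for illustration, and `segment_with_kernels` is a hypothetical name, not a function from the authors' code.

```python
import math

def segment_with_kernels(kernels, features):
    """Apply each kernel as a 1x1 convolution over the feature map.

    kernels:  list of N kernels, each a list of C weights
              (one kernel per thing instance or stuff category)
    features: list of C channels, each an H x W grid (list of rows)
    returns:  list of N masks, each an H x W grid of values in (0, 1)
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    masks = []
    for kernel in kernels:
        assert len(kernel) == channels, "kernel size must match channel count"
        mask = []
        for y in range(height):
            row = []
            for x in range(width):
                # Weighted sum over channels at this pixel (1x1 convolution).
                response = sum(kernel[c] * features[c][y][x] for c in range(channels))
                # Sigmoid is an assumed choice of output activation.
                row.append(1.0 / (1.0 + math.exp(-response)))
            mask.append(row)
        masks.append(mask)
    return masks

# Toy input: 2 channels on a 2x2 grid, and 2 hand-written kernels.
features = [
    [[1.0, 0.0], [0.0, 0.0]],  # channel 0 responds at the top-left pixel
    [[0.0, 0.0], [0.0, 1.0]],  # channel 1 responds at the bottom-right pixel
]
kernels = [[4.0, -4.0], [-4.0, 4.0]]
masks = segment_with_kernels(kernels, features)
```

In this toy case the first kernel produces a mask that is high at the top-left pixel and low at the bottom-right pixel, and the second kernel does the opposite. No boxes are involved: each kernel yields one full-resolution mask directly.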

In this paper, we present a conceptually simple, strong, and efficient framework for fully- and weakly-supervised panoptic segmentation, called Panoptic FCN. Our approach aims to represent and predict foreground things and background stuff in a unified fully convolutional pipeline, which can be optimized with either full supervision or point-based weak supervision. In particular, Panoptic FCN encodes each object instance or stuff category with the proposed kernel generator and produces the prediction by convolving the high-resolution feature directly. With this approach, the instance-aware property for things and the semantically consistent property for stuff can both be satisfied in a simple generate-kernel-then-segment workflow. Without extra boxes for localization or instance separation, the proposed approach outperforms previous box-based and box-free models with high efficiency. Furthermore, we propose a new form of point-based annotation for weakly-supervised panoptic segmentation. It needs only several random points for both things and stuff, which dramatically reduces the human annotation cost. The proposed Panoptic FCN also performs substantially better in this weakly-supervised setting, achieving 82% of the fully-supervised performance with only 20 randomly annotated points per instance. Extensive experiments demonstrate the effectiveness and efficiency of Panoptic FCN on the COCO, VOC 2012, Cityscapes, and Mapillary Vistas datasets. It also sets a new leading benchmark for both fully- and weakly-supervised panoptic segmentation. Our code and models are publicly available at https://github.com/dvlab-research/PanopticFCN.
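To make point-based supervision concrete, the sketch below evaluates a loss only at annotated pixels and ignores every other pixel. The abstract does not state which loss the authors use, so binary cross-entropy is a stand-in chosen for illustration. The `(y, x, label)` point format and the name `point_supervised_bce` are likewise assumptions.

```python
import math

def point_supervised_bce(mask_probs, labeled_points, eps=1e-7):
    """Mean binary cross-entropy over annotated points only.

    mask_probs:     H x W grid of predicted probabilities for one mask
    labeled_points: list of (y, x, label), where label is 1 if the point
                    belongs to the target and 0 otherwise (assumed format)
    """
    if not labeled_points:
        return 0.0
    total = 0.0
    for y, x, label in labeled_points:
        # Clamp to avoid log(0) on saturated predictions.
        p = min(max(mask_probs[y][x], eps), 1.0 - eps)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(labeled_points)

# Toy input: a 2x2 predicted mask and two annotated points.
pred = [[0.9, 0.2], [0.1, 0.8]]
points = [(0, 0, 1), (1, 0, 0)]  # one positive point, one negative point
loss = point_supervised_bce(pred, points)
```

Only the two annotated pixels contribute to `loss`. The unlabeled pixels at `(0, 1)` and `(1, 1)` have no effect, which is what lets a handful of points replace a dense mask as the training signal.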

