CV, Apr 9, 2020

CenterMask: single shot instance segmentation with point representation

arXiv:2004.04446v20.0080 citations
AI Analysis 55

This paper addresses the problem of fast and accurate instance segmentation for computer vision applications, offering a competitive incremental improvement over existing one-stage methods.

The paper tackles single-shot instance segmentation by decomposing it into local shape prediction and global saliency generation. It achieves 34.5 mask AP at 12.3 fps on COCO, outperforming all other one-stage methods except TensorMask, which is about five times slower.

In this paper, we propose a single-shot instance segmentation method that is simple, fast and accurate. There are two main challenges for one-stage instance segmentation: differentiating object instances and aligning features pixel-wise. Accordingly, we decompose instance segmentation into two parallel subtasks: Local Shape prediction, which separates instances even in overlapping conditions, and Global Saliency generation, which segments the whole image in a pixel-to-pixel manner. The outputs of the two branches are assembled to form the final instance masks. To realize this, the local shape information is taken from the representation of object center points. Trained entirely from scratch and without any bells and whistles, the proposed CenterMask achieves 34.5 mask AP at a speed of 12.3 fps, using a single model with single-scale training/testing on the challenging COCO dataset. The accuracy is higher than that of all other one-stage instance segmentation methods except TensorMask, which is 5 times slower; this shows the effectiveness of CenterMask. In addition, our method can be easily embedded into other one-stage object detectors such as FCOS and performs well, demonstrating the generality of CenterMask.
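
The assembly of the two branch outputs can be illustrated with a short sketch. The abstract does not say how the outputs are combined, so everything below is an assumption made for illustration: the coarse local shape at a center point is resized to the object's box, both maps pass through a sigmoid, and they are multiplied element-wise. The names `assemble_mask` and `resize_nearest`, the box format, and the nearest-neighbour resize are all hypothetical and not taken from the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def resize_nearest(patch, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array (a simple stand-in for
    whatever interpolation a real implementation would use)."""
    in_h, in_w = patch.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return patch[rows[:, None], cols[None, :]]

def assemble_mask(local_shape, global_saliency, box):
    """Combine one instance's coarse local shape with the image-wide
    saliency map. Assumed inputs (all hypothetical):
      local_shape     -- (S, S) logits predicted at the object's center point
      global_saliency -- (H, W) logits for the whole image
      box             -- (x0, y0, x1, y1) integer pixel box of the object
    Returns an (H, W) map of mask probabilities, zero outside the box."""
    x0, y0, x1, y1 = box
    h, w = y1 - y0, x1 - x0
    # Local branch: tells instances apart, but only at coarse resolution.
    shape = sigmoid(resize_nearest(local_shape, h, w))
    # Global branch: pixel-aligned foreground evidence, not instance-aware.
    saliency = sigmoid(global_saliency[y0:y1, x0:x1])
    mask = np.zeros(global_saliency.shape, dtype=float)
    # Assumed assembly rule: element-wise product of the two probabilities.
    mask[y0:y1, x0:x1] = shape * saliency
    return mask

# Toy example: the left half of the image is salient, the right half is not.
saliency_logits = np.full((8, 8), -10.0)
saliency_logits[:, :4] = 10.0
shape_logits = np.full((4, 4), 10.0)  # local shape covers the whole box
mask = assemble_mask(shape_logits, saliency_logits, (2, 2, 6, 6))
```

Under this assumed rule, a pixel is kept only when both branches agree: the local shape assigns it to this instance and the saliency map marks it as foreground, which is one way two overlapping objects could share saliency yet still receive separate masks.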

