CV

Aug 12, 2023

SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning

Muzhi Zhu, Hengtao Li, Hao Chen, Chengxiang Fan, Weian Mao, Chenchen Jing, Yifan Liu, Chunhua Shen

CMU

arXiv:2308.06531v1

28 citations

h-index: 81

Originality: Incremental advance

AI Analysis

The paper addresses the inability of closed-set instance segmentation models to detect novel objects. The advance is incremental, since it builds on prior open-world approaches.

The paper tackles open-world instance segmentation by proposing SegPrompt, a training mechanism that uses category information to improve class-agnostic segmentation of both known and unknown objects. On the authors' new benchmark, it improves overall and unseen-class AR by 5.6% and 6.1%, respectively.

Current closed-set instance segmentation models rely on pre-defined class labels for each mask during training and evaluation, largely limiting their ability to detect novel objects. Open-world instance segmentation (OWIS) models address this challenge by detecting unknown objects in a class-agnostic manner. However, previous OWIS approaches completely erase category information during training to keep the model's ability to generalize to unknown objects. In this work, we propose a novel training mechanism termed SegPrompt that uses category information to improve the model's class-agnostic segmentation ability for both known and unknown categories. In addition, the previous OWIS training setting exposes the unknown classes to the training set and brings information leakage, which is unreasonable in the real world. Therefore, we provide a new open-world benchmark closer to a real-world scenario by dividing the dataset classes into known-seen-unseen parts. For the first time, we focus on the model's ability to discover objects that never appear in the training set images. Experiments show that SegPrompt can improve the overall and unseen detection performance by 5.6% and 6.1% in AR on our new benchmark without affecting the inference efficiency. We further demonstrate the effectiveness of our method on existing cross-dataset transfer and strongly supervised settings, leading to 5.5% and 12.3% relative improvement.
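The abstract reports average recall (AR) separately for overall and unseen objects on a known-seen-unseen class split. As a rough illustration of how a per-group, class-agnostic AR could be computed, here is a minimal sketch. It assumes a COCO-style AR averaged over IoU thresholds from 0.50 to 0.95, represents masks as toy sets of pixel coordinates, and lets one prediction match several ground-truth masks (a simplification of one-to-one matching). All function names, the category-to-group assignment, and the toy data are hypothetical; this is not the paper's evaluation code.

```python
from typing import Dict, FrozenSet, List, Tuple

# Toy mask representation (an assumption): a set of (row, col) pixels.
Mask = FrozenSet[Tuple[int, int]]

# COCO-style IoU thresholds 0.50, 0.55, ..., 0.95 (assumed, not stated in the abstract).
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel-set masks."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recall_at(gt_masks: List[Mask], pred_masks: List[Mask], thr: float) -> float:
    """Fraction of ground-truth masks covered by any prediction at IoU >= thr.

    Simplification: predictions are class-agnostic and may match more than
    one ground-truth mask.
    """
    if not gt_masks:
        return 0.0
    hits = sum(1 for g in gt_masks if any(mask_iou(g, p) >= thr for p in pred_masks))
    return hits / len(gt_masks)


def grouped_ar(
    gts: List[Tuple[str, Mask]],
    preds: List[Mask],
    group_of: Dict[str, str],
) -> Dict[str, float]:
    """Average recall per class group (e.g. known / seen / unseen)."""
    by_group: Dict[str, List[Mask]] = {}
    for category, mask in gts:
        by_group.setdefault(group_of[category], []).append(mask)
    return {
        group: sum(recall_at(masks, preds, t) for t in IOU_THRESHOLDS) / len(IOU_THRESHOLDS)
        for group, masks in by_group.items()
    }


def row_mask(start: int, stop: int) -> Mask:
    """Helper for the toy example: pixels (0, start) .. (0, stop - 1)."""
    return frozenset((0, c) for c in range(start, stop))


# Toy data: "dog" is treated as a known class, "kite" as an unseen class.
group_of = {"dog": "known", "kite": "unseen"}
gts = [("dog", row_mask(0, 4)), ("kite", row_mask(10, 16))]
preds = [row_mask(0, 4), row_mask(11, 17)]  # exact hit on dog, shifted hit on kite

result = grouped_ar(gts, preds, group_of)
print(result)
```

In this toy case the shifted prediction overlaps the "kite" mask with IoU 5/7, so it counts as a hit only at the five thresholds up to 0.70, which is why the unseen group scores lower than the known group.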
