CV | Aug 5, 2022

Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation

arXiv:2208.03079v2 | 12 citations | h-index: 70 | Has Code
AI Analysis

This work addresses how to incorporate temporal information effectively into online models for video instance segmentation. It is incremental in that it builds on existing frameworks, but it introduces a new paradigm.

The paper tackles video instance segmentation by proposing a new online paradigm called Instance As Identity (IAI), which models temporal information for detection and tracking efficiently, achieving state-of-the-art results on benchmarks like YouTube-VIS-2019 (43.7 mAP), YouTube-VIS-2021 (38.0 mAP), and OVIS (20.6 mAP).

Modeling temporal information for both detection and tracking in a unified framework has proven to be a promising solution to video instance segmentation (VIS). However, how to effectively incorporate temporal information into an online model remains an open problem. In this work, we propose a new online VIS paradigm named Instance As Identity (IAI), which models temporal information for both detection and tracking in an efficient way. In detail, IAI employs a novel identification module to explicitly predict identification numbers for tracking instances. To pass temporal information across frames, IAI utilizes an association module that combines current features and past embeddings. Notably, IAI can be integrated with different image models. We conduct extensive experiments on three VIS benchmarks. IAI outperforms all the online competitors on YouTube-VIS-2019 (ResNet-101 43.7 mAP) and YouTube-VIS-2021 (ResNet-50 38.0 mAP). Surprisingly, on the more challenging OVIS, IAI achieves SOTA performance (20.6 mAP). Code is available at https://github.com/zfonemore/IAI
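
The idea of assigning each instance an identification number and carrying embeddings across frames can be illustrated with a toy sketch. This is not the IAI implementation: the paper's identification and association modules are learned network components, whereas the sketch below swaps in plain cosine-similarity matching and a moving-average update. The class name `IdentityBank`, the `threshold` and `momentum` parameters, and the 2-D feature vectors are all hypothetical and exist only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class IdentityBank:
    """Toy online tracker (hypothetical, not the IAI modules).

    Keeps one embedding per identity, matches each new instance feature
    to the most similar stored embedding, and blends the matched
    embedding with the current feature.
    """

    def __init__(self, threshold=0.8, momentum=0.5):
        self.threshold = threshold  # minimum similarity to reuse an identity
        self.momentum = momentum    # weight kept on the past embedding
        self.embeddings = {}        # identity number -> embedding
        self.next_id = 1

    def step(self, features):
        """Process one frame and return an identity number per instance."""
        ids, used = [], set()
        for feat in features:
            best_id, best_sim = None, self.threshold
            for ident, emb in self.embeddings.items():
                if ident in used:  # one instance per identity per frame
                    continue
                sim = cosine(feat, emb)
                if sim >= best_sim:
                    best_id, best_sim = ident, sim
            if best_id is None:
                # No stored identity is similar enough: open a new one.
                best_id = self.next_id
                self.next_id += 1
                self.embeddings[best_id] = list(feat)
            else:
                # Combine the past embedding with the current feature.
                m = self.momentum
                old = self.embeddings[best_id]
                self.embeddings[best_id] = [
                    m * o + (1 - m) * f for o, f in zip(old, feat)
                ]
            used.add(best_id)
            ids.append(best_id)
        return ids


bank = IdentityBank()
frame1 = bank.step([[1.0, 0.0], [0.0, 1.0]])  # two new instances -> [1, 2]
frame2 = bank.step([[0.1, 0.9], [0.9, 0.1]])  # same objects, swapped order -> [2, 1]
frame3 = bank.step([[-1.0, -1.0]])            # unseen object -> [3]
```

Because each frame is processed as it arrives and only the per-identity embeddings are stored, the sketch stays online in the sense the abstract describes, although a real system would predict identities with a trained module rather than a fixed similarity threshold.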

Code Implementations: 1 repo
