CV · Oct 8, 2021

An End-to-End Trainable Video Panoptic Segmentation Method using Transformers

arXiv:2110.04009v1 · 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of generating instance tracking IDs together with segmentation results across video sequences, an incremental advance for computer vision applications.

The paper tackles video panoptic segmentation, a task that unifies panoptic segmentation and multi-object tracking, and reports STQ scores of 57.81% on KITTI-STEP and 31.8% on MOTChallenge-STEP.

In this paper, we present an algorithm for video panoptic segmentation, a newly emerging area of research. Video panoptic segmentation is a task that unifies panoptic segmentation and multi-object tracking. In other words, it requires generating instance tracking IDs along with panoptic segmentation results across video sequences. Our proposed video panoptic segmentation algorithm uses a transformer and can be trained end-to-end with multiple video frames as input. We test our method on the STEP dataset and report its performance with the recently proposed STQ metric. The method achieved 57.81% on the KITTI-STEP dataset and 31.8% on the MOTChallenge-STEP dataset.
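The abstract reports results in STQ but does not define it. As a rough illustration, the sketch below follows a common reading of STQ as the geometric mean of an association quality (AQ, how consistently track IDs follow each ground-truth object over time) and a segmentation quality (SQ, mean IoU over semantic classes). It is a simplified sketch, not the official evaluation code: the function names, the `(T, H, W)` array layout, and the convention that track ID 0 means "no track" are assumptions made here, and void labels and crowd regions are ignored.

```python
import numpy as np

def semantic_quality(pred_sem, gt_sem, num_classes):
    """Mean IoU over the classes present in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        p, g = pred_sem == c, gt_sem == c
        union = np.logical_or(p, g).sum()
        if union > 0:
            ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious)) if ious else 0.0

def association_quality(pred_ids, gt_ids):
    """Average over ground-truth tracks of overlap-weighted track IoU.

    Assumption: ID 0 marks pixels that belong to no track.
    """
    scores = []
    for g_id in np.unique(gt_ids[gt_ids > 0]):
        g = gt_ids == g_id
        score = 0.0
        # Visit every predicted track that overlaps this ground-truth track.
        for p_id in np.unique(pred_ids[g & (pred_ids > 0)]):
            p = pred_ids == p_id
            inter = np.logical_and(p, g).sum()
            union = np.logical_or(p, g).sum()
            score += inter * (inter / union)
        scores.append(score / g.sum())
    return float(np.mean(scores)) if scores else 0.0

def stq(pred_sem, pred_ids, gt_sem, gt_ids, num_classes):
    """Geometric mean of association quality and segmentation quality."""
    aq = association_quality(pred_ids, gt_ids)
    sq = semantic_quality(pred_sem, gt_sem, num_classes)
    return float(np.sqrt(aq * sq))

# Toy video: 2 frames of 1x3 pixels; class 0 = background, class 1 = car.
gt_sem = np.array([[[0, 1, 1]], [[0, 1, 1]]])
gt_ids = np.array([[[0, 1, 1]], [[0, 1, 1]]])
# The prediction segments perfectly but switches the car's ID in frame 2.
pred_sem = gt_sem.copy()
pred_ids = np.array([[[0, 1, 1]], [[0, 2, 2]]])

print(round(stq(pred_sem, pred_ids, gt_sem, gt_ids, num_classes=2), 4))  # 0.7071
```

In the toy video the semantic maps match exactly, so SQ is 1, but the ID switch splits the car into two predicted tracks that each cover half of the ground-truth track, which gives AQ = 0.5 and STQ = sqrt(0.5). This is why a method must keep IDs stable across frames, and not only segment each frame well, to score highly.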

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
