CV · Nov 2, 2023

CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking and Segmentation

arXiv:2311.00987v1 · 24 citations · h-index: 19
Originality: Incremental advance
AI Analysis

This paper addresses video instance segmentation for applications such as autonomous driving and smart retail, but it appears incremental because it builds on existing multi-task learning approaches.

The paper tackles the problem of simultaneously performing object detection, instance segmentation, and multi-object tracking in videos by proposing a collaborative multi-task learning framework with associative connections between task heads. It achieves encouraging results on KITTI MOTS and MOTS Challenge datasets.

The advancement of computer vision has pushed visual analysis tasks from still images to the video domain. In recent years, video instance segmentation, which aims to track and segment multiple objects in video frames, has drawn much attention for its potential applications in various emerging areas such as autonomous driving, intelligent transportation, and smart retail. In this paper, we propose an effective framework for instance-level visual analysis on video frames, which can simultaneously conduct object detection, instance segmentation, and multi-object tracking. The core idea of our method is collaborative multi-task learning, which is achieved by a novel structure, named associative connections, among detection, segmentation, and tracking task heads in an end-to-end learnable CNN. These additional connections allow information propagation across multiple related tasks, so as to benefit these tasks simultaneously. We evaluate the proposed method extensively on KITTI MOTS and MOTS Challenge datasets and obtain quite encouraging results.
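
To make the idea of connections between task heads concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the abstract does not say which direction the associative connections run or how features are combined, so this toy assumes that the segmentation head receives the detection head's hidden output and that the tracking head receives both, joined by simple concatenation. The class name `ToyCollaborativeHeads`, the helper functions, the layer sizes, and the use of dense layers instead of a CNN are all hypothetical choices made for illustration.

```python
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectifier."""
    return [max(0.0, v) for v in x]


def init_layer(n_in, n_out, rng):
    """Random weights in [-0.5, 0.5] and zero biases (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class ToyCollaborativeHeads:
    """Hypothetical three-head model over one shared per-object feature vector."""

    def __init__(self, feat_dim=8, hidden=4, seed=0):
        rng = random.Random(seed)
        # Detection head sees only the shared feature.
        self.det = init_layer(feat_dim, hidden, rng)
        # Assumed connection: segmentation also sees the detection hidden output.
        self.seg = init_layer(feat_dim + hidden, hidden, rng)
        # Assumed connection: tracking sees detection and segmentation outputs.
        self.trk = init_layer(feat_dim + 2 * hidden, hidden, rng)

    def forward(self, shared):
        det_h = relu(linear(shared, *self.det))
        # List concatenation stands in for feature concatenation.
        seg_h = relu(linear(shared + det_h, *self.seg))
        trk_h = relu(linear(shared + det_h + seg_h, *self.trk))
        return {"det": det_h, "seg": seg_h, "trk": trk_h}


model = ToyCollaborativeHeads()
outputs = model.forward([0.1] * 8)
print({name: len(vec) for name, vec in outputs.items()})
```

Because each later head consumes the earlier heads' outputs, a training loss on tracking would send gradients back through the segmentation and detection heads as well, which is one plausible reading of "information propagation across multiple related tasks".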
