Track to Detect and Segment: An Online Multi-Object Tracker
This work addresses object detection and segmentation within real-time tracking, for applications such as autonomous driving and video analysis. Its contribution is to feed tracking information into detection rather than running detection independently of tracking.
The paper tackles online multi-object tracking by introducing TraDeS, a joint detection and tracking model that uses tracking clues to enhance detection and segmentation. Results are reported on four datasets, including MOT and nuScenes.
Most online multi-object trackers perform object detection as a stand-alone step in a neural network, without any input from tracking. In this paper, we present a new online joint detection and tracking model, TraDeS (TRAck to DEtect and Segment), which exploits tracking clues to assist detection end-to-end. TraDeS infers object tracking offsets from a cost volume and uses them to propagate previous object features, improving current object detection and segmentation. The effectiveness and superiority of TraDeS are shown on four datasets: MOT (2D tracking), nuScenes (3D tracking), MOTS and YouTube-VIS (instance segmentation tracking). Project page: https://jialianwu.com/projects/TraDeS.html.
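To make the cost-volume idea concrete, the following is a minimal NumPy sketch, not the TraDeS implementation. It assumes per-location embeddings for the current and previous frames, builds a dense similarity volume, turns it into per-location offsets with a soft-argmax, and then gathers previous features along those offsets. The function names (`cost_volume`, `tracking_offsets`, `propagate`), the `temperature` parameter, the soft-argmax over all previous locations, and the nearest-neighbor gather are all illustrative assumptions; the abstract does not specify how these steps are carried out.

```python
import numpy as np


def cost_volume(curr, prev):
    """Dot-product similarity between every current and previous location.

    curr, prev: (H, W, C) embedding maps. Returns an (H, W, H, W) array.
    """
    return np.einsum("ijc,klc->ijkl", curr, prev)


def tracking_offsets(curr, prev, temperature=1.0):
    """Soft-argmax offsets (d_row, d_col) from current to previous frame.

    Assumption: a plain softmax over all previous locations stands in for
    whatever matching scheme the actual model uses.
    """
    h, w, _ = curr.shape
    cost = cost_volume(curr, prev).reshape(h, w, h * w) / temperature
    cost -= cost.max(axis=-1, keepdims=True)  # numerical stability
    prob = np.exp(cost)
    prob /= prob.sum(axis=-1, keepdims=True)

    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    expected_row = prob @ rows.reshape(-1)  # (H, W)
    expected_col = prob @ cols.reshape(-1)  # (H, W)
    return np.stack([expected_row - rows, expected_col - cols], axis=-1)


def propagate(prev_feat, offsets):
    """Gather previous-frame features at the matched locations.

    Assumption: nearest-neighbor sampling with border clipping, chosen
    only to keep the sketch short.
    """
    h, w, _ = prev_feat.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_r = np.clip(np.rint(rows + offsets[..., 0]).astype(int), 0, h - 1)
    src_c = np.clip(np.rint(cols + offsets[..., 1]).astype(int), 0, w - 1)
    return prev_feat[src_r, src_c]


if __name__ == "__main__":
    # Toy data: each previous location has a distinct one-hot embedding,
    # and the current frame is the previous one shifted right by one column.
    prev = 10.0 * np.eye(16).reshape(4, 4, 16)
    curr = np.roll(prev, shift=1, axis=1)
    offsets = tracking_offsets(curr, prev)
    warped = propagate(prev, offsets)
    print(offsets[0, 1], np.allclose(warped, curr))
```

In a joint model, the propagated features would then be combined with the current-frame features before the detection and segmentation heads. How that fusion is done is outside what the abstract states, so it is left out of the sketch.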