CV AI LG RO | Feb 7, 2023

Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking

Ziqi Pang, Jie Li, Pavel Tokmakov, Dian Chen, Sergey Zagoruyko, Yu-Xiong Wang

arXiv:2302.03802v2 | 23.877 citations | h-index: 45 | Has Code

Originality: Highly original

AI Analysis

This work addresses robust multi-camera 3D multi-object tracking in autonomous driving scenarios, in particular keeping object identities consistent over time and through long-term occlusions.

The paper tackles multi-camera 3D multi-object tracking by proposing PF-Track, an end-to-end framework that integrates past and future reasoning for spatio-temporal continuity, resulting in a large improvement in AMOTA and a 90% reduction in ID-Switches on the nuScenes dataset.

This work proposes an end-to-end multi-camera 3D multi-object tracking (MOT) framework. It emphasizes spatio-temporal continuity and integrates both past and future reasoning for tracked objects. Thus, we name it "Past-and-Future reasoning for Tracking" (PF-Track). Specifically, our method adapts the "tracking by attention" framework and represents tracked instances coherently over time with object queries. To explicitly use historical cues, our "Past Reasoning" module learns to refine the tracks and enhance the object features by cross-attending to queries from previous frames and other objects. The "Future Reasoning" module digests historical information and predicts robust future trajectories. In the case of long-term occlusions, our method maintains the object positions and enables re-association by integrating motion predictions. On the nuScenes dataset, our method improves AMOTA by a large margin and reduces ID-Switches by 90% compared to prior approaches, an order of magnitude fewer. The code and models are available at https://github.com/TRI-ML/PF-Track.
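
The occlusion handling described in the abstract can be illustrated with a small, self-contained sketch. This is not the PF-Track implementation, which operates on learned object queries with attention; it is a toy 2D stand-in that assumes each track already carries a list of predicted per-frame offsets. The names `Track`, `coast`, `reassociate`, `run_demo`, and the `gate` distance threshold are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Track:
    track_id: int
    position: Point                 # last observed or extrapolated (x, y)
    predicted_offsets: List[Point]  # hypothetical per-frame (dx, dy); must be non-empty
    occluded_frames: int = 0


def coast(track: Track) -> None:
    """No detection matched: move the track along its predicted trajectory."""
    # Use the next predicted offset; reuse the last one if predictions run out.
    idx = min(track.occluded_frames, len(track.predicted_offsets) - 1)
    dx, dy = track.predicted_offsets[idx]
    track.position = (track.position[0] + dx, track.position[1] + dy)
    track.occluded_frames += 1


def reassociate(track: Track, detections: List[Point],
                gate: float = 2.0) -> Optional[Point]:
    """Match the nearest detection within `gate` of the track position, if any."""
    best: Optional[Point] = None
    best_dist = gate
    for det in detections:
        dist = hypot(det[0] - track.position[0], det[1] - track.position[1])
        if dist <= best_dist:
            best, best_dist = det, dist
    if best is not None:
        # The object reappeared: snap to the detection and keep the same ID.
        track.position = best
        track.occluded_frames = 0
    return best


def run_demo() -> Track:
    track = Track(track_id=7, position=(0.0, 0.0),
                  predicted_offsets=[(1.0, 0.0), (1.0, 0.5), (1.0, 0.5)])
    # Three frames with no detections (occlusion), then the object reappears.
    frames: List[List[Point]] = [[], [], [], [(3.2, 1.1)]]
    for detections in frames:
        if reassociate(track, detections) is None:
            coast(track)
    return track
```

In `run_demo`, the track is carried forward through three empty frames using its predicted offsets, so the reappearing detection falls inside the gate and is matched to the existing track ID instead of starting a new track. Dropping the coasting step would leave the track at its last observed position, where the same detection would fall outside the gate, which is the kind of failure that shows up as an ID switch.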
