3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation
This work addresses the need for efficient and accurate 3D object detection and tracking in autonomous driving and robotics. It offers a novel integration of detection and tracking, although the contribution is incremental in that it combines existing concepts.
The paper tackles the problem of 3D object detection and tracking from LiDAR data by proposing 3D-FCT, a Siamese network that uses temporal information to simultaneously perform both tasks, resulting in a 5.57% mAP improvement over a state-of-the-art method on the KITTI tracking dataset.
3D object detection using LiDAR data remains a key task for applications like autonomous driving and robotics. Unlike 2D images, LiDAR data is almost always collected over a period of time. However, most work in this area has focused on performing detection independently of the temporal domain. In this paper we present 3D-FCT, a Siamese network architecture that utilizes temporal information to simultaneously perform the related tasks of 3D object detection and tracking. The network is trained to predict the movement of an object based on the correlation features of extracted keypoints across time. Computing correlation only across keypoints, rather than across the full point cloud, allows for real-time object detection. We further extend the multi-task objective to include a tracking regression loss. Finally, we produce high-accuracy detections by linking short-term object tracklets into long-term tracks based on the predicted tracks. Our proposed method is evaluated on the KITTI tracking dataset, where it is shown to provide an improvement of 5.57% mAP over a state-of-the-art approach.
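To make the idea of correlating only keypoints across frames concrete, the following is a minimal sketch and not the paper's implementation. It assumes cosine similarity as the correlation measure and a fixed spatial search radius, neither of which is stated above. The function name `keypoint_correlation`, the `radius` parameter, and the toy inputs are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def keypoint_correlation(feats_t, feats_t1, xyz_t, xyz_t1, radius=2.0):
    """Correlate keypoint features between frame t and frame t+1.

    feats_t, feats_t1: per-keypoint feature vectors for each frame.
    xyz_t, xyz_t1:     matching 3D keypoint coordinates.
    radius:            assumed search radius; pairs farther apart get 0.0.

    Returns a len(feats_t) x len(feats_t1) matrix of cosine similarities.
    """
    unit_t = [_normalize(f) for f in feats_t]
    unit_t1 = [_normalize(f) for f in feats_t1]
    corr = []
    for feat_a, pos_a in zip(unit_t, xyz_t):
        row = []
        for feat_b, pos_b in zip(unit_t1, xyz_t1):
            if math.dist(pos_a, pos_b) <= radius:
                # Cosine similarity of the two unit-length feature vectors.
                row.append(sum(x * y for x, y in zip(feat_a, feat_b)))
            else:
                # Outside the search radius: treat as uncorrelated.
                row.append(0.0)
        corr.append(row)
    return corr


# Toy example: two keypoints that each move 0.5 m along x between frames.
feats = [[1.0, 0.0], [0.0, 2.0]]
xyz_t = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
xyz_t1 = [(0.5, 0.0, 0.0), (10.5, 0.0, 0.0)]
corr = keypoint_correlation(feats, feats, xyz_t, xyz_t1)
```

Because the matrix has one entry per keypoint pair rather than per point pair, its size depends on the number of sampled keypoints and not on the size of the raw LiDAR sweep, which is the property the abstract credits for real-time operation.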