CV · Sep 19, 2018

Combined Image- and World-Space Tracking in Traffic Scenes

Aljosa Osep, Wolfgang Mehner, Markus Mathias, Bastian Leibe

arXiv:1809.07357v1 15.8137 citations

Originality: Incremental advance

AI Analysis

The paper addresses object tracking for autonomous systems such as self-driving cars. The advance is incremental: it builds on existing tracking methods and integrates 3D information more thoroughly.

The paper tackles object tracking in traffic scenes by proposing a method that uses image- and world-space information jointly throughout the pipeline. It matches the state of the art on the KITTI benchmark in 2D and shows significant improvements in 3D localization precision.

Tracking in urban street scenes plays a central role in autonomous systems such as self-driving cars. Most current vision-based tracking methods perform tracking in the image domain. Other approaches, e.g., those based on LIDAR and radar, track purely in 3D. While some vision-based tracking methods invoke 3D information in parts of their pipeline, and some 3D-based methods utilize image-based information in components of their approach, we propose to use image- and world-space information jointly throughout our method. We present our tracking pipeline as a 3D extension of image-based tracking. From enhancing the detections with 3D measurements to the reported positions of every tracked object, we use world-space 3D information at every stage of processing. We accomplish this with our novel coupled 2D-3D Kalman filter, combined with a conceptually clean and extendable hypothesize-and-select framework. Our approach matches the current state of the art on the official KITTI benchmark, which performs evaluation in the 2D image domain only. Further experiments show significant improvements in 3D localization precision when our coupled 2D-3D tracking is enabled.
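To make the idea of a filter that couples image-space and world-space measurements more concrete, here is a minimal sketch. It is not the formulation from the paper, which the abstract does not spell out. It assumes a generic extended Kalman filter over a ground-plane state [X, Z, vX, vZ], a pinhole camera with hypothetical intrinsics (FOCAL, CX), and made-up noise values. A 2D detection always contributes an image-column measurement, and a 3D position measurement is added when one is available, so both kinds of observation refine the same state.

```python
import numpy as np

FOCAL, CX = 720.0, 600.0   # hypothetical pinhole intrinsics (pixels)
DT = 0.1                   # hypothetical frame interval (seconds)

# Constant-velocity motion model over state [X, Z, vX, vZ] (metres, m/s).
F = np.array([[1, 0, DT, 0],
              [0, 1, 0, DT],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
Q = np.eye(4) * 0.01       # assumed process noise


def predict(x, P):
    """Propagate state and covariance one frame ahead."""
    return F @ x, F @ P @ F.T + Q


def project(x):
    """Image column of the object centre under the assumed pinhole model."""
    return FOCAL * x[0] / x[1] + CX


def update(x, P, u_meas, xz_meas=None, r_px=4.0, r_m=0.5):
    """EKF update with a 2D measurement, plus a 3D one when available."""
    X, Z = x[0], x[1]
    # Jacobian of the projection with respect to the state.
    H = [[FOCAL / Z, -FOCAL * X / Z**2, 0.0, 0.0]]
    residual = [u_meas - project(x)]
    noise = [r_px**2]
    if xz_meas is not None:
        # Direct observation of the ground-plane position (e.g. from stereo).
        H += [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
        residual += [xz_meas[0] - X, xz_meas[1] - Z]
        noise += [r_m**2, r_m**2]
    H = np.array(H, dtype=float)
    y = np.array(residual, dtype=float)
    R = np.diag(noise)
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P
```

In this sketch, a frame with only a 2D detection still corrects the 3D state through the projection Jacobian, while a frame with both measurements tightens the estimate further. A real tracker would also need data association and track management, which the abstract attributes to its hypothesize-and-select framework.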
