TopTrack: Tracking Objects By Their Top
This work addresses tracking accuracy in crowded scenes, which matters for applications such as surveillance and autonomous driving, but the contribution is incremental because it modifies an existing paradigm.
The paper tackles missed detections in multi-object tracking (MOT) that occur when object center keypoints are occluded in crowded scenes. It proposes TopTrack, which detects objects by their top keypoint instead and achieves competitive results on two MOT benchmarks.
In recent years, the joint detection-and-tracking paradigm has become a popular approach to the multi-object tracking (MOT) task. Many methods following this paradigm use the object center keypoint for detection. However, we argue that the center point is not optimal because it is often not visible in crowded scenarios, which results in many missed detections when objects are partially occluded. We propose TopTrack, a joint detection-and-tracking method that uses the top of the object as the detection keypoint instead of the center, because the top is more often visible. Furthermore, TopTrack processes consecutive frames in separate streams in order to facilitate training. Our experiments show that using the object top as the detection keypoint can reduce the number of missed detections, which in turn leads to more complete trajectories and fewer lost trajectories. TopTrack achieves results competitive with other state-of-the-art trackers on two MOT benchmarks.
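The intuition behind preferring the top keypoint can be illustrated with a small geometric sketch. This is not the TopTrack implementation: it assumes axis-aligned boxes in (x1, y1, x2, y2) image coordinates with y growing downward, it assumes the top keypoint is the midpoint of the box's upper edge, and the names `Box`, `center_keypoint`, `top_keypoint`, and `is_covered` are hypothetical helpers introduced only for illustration. The example boxes are made up, not taken from any benchmark.

```python
from typing import NamedTuple, Tuple

Point = Tuple[float, float]


class Box(NamedTuple):
    """Axis-aligned box; image coordinates, y grows downward (assumption)."""
    x1: float  # left
    y1: float  # top
    x2: float  # right
    y2: float  # bottom


def center_keypoint(box: Box) -> Point:
    """Geometric center of the box, as used by center-based detectors."""
    return ((box.x1 + box.x2) / 2.0, (box.y1 + box.y2) / 2.0)


def top_keypoint(box: Box) -> Point:
    """Midpoint of the upper edge (assumed definition of the object top)."""
    return ((box.x1 + box.x2) / 2.0, box.y1)


def is_covered(point: Point, occluder: Box) -> bool:
    """True if the point lies inside the occluder's box."""
    x, y = point
    return occluder.x1 <= x <= occluder.x2 and occluder.y1 <= y <= occluder.y2


# A pedestrian whose lower body is hidden behind a closer object.
person = Box(100, 50, 140, 170)
occluder = Box(90, 90, 160, 200)

print(is_covered(center_keypoint(person), occluder))  # True: center is hidden
print(is_covered(top_keypoint(person), occluder))     # False: top stays visible
```

In this toy case the center at (120, 110) falls inside the occluding box while the top at (120, 50) does not, which mirrors the partial-occlusion situation the abstract describes. A real tracker would regress these keypoints from heatmaps rather than compute them from known boxes.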