Instance Flow Based Online Multiple Object Tracking
This work addresses tracking objects with high relative motion in video, for applications such as surveillance or autonomous driving. Its contribution is incremental, since it builds on existing segmentation and optical flow methods.
The paper tackles online Multiple Object Tracking (MOT) in monocular video by using instance-aware semantic segmentation and optical flow to predict object positions and shapes, achieving a MOTA score of 32.1 on the MOT 2D 2015 test set.
We present a method to perform online Multiple Object Tracking (MOT) of known object categories in monocular video data. Current Tracking-by-Detection MOT approaches build on 2D bounding box detections. In contrast, we exploit state-of-the-art instance-aware semantic segmentation techniques to compute 2D shape representations of target objects in each frame. We predict the position and shape of segmented instances in subsequent frames by exploiting optical flow cues. We define an affinity matrix between instances of subsequent frames that reflects locality and visual similarity. The instance association is solved by applying the Hungarian method. We evaluate different configurations of our algorithm on the MOT 2D 2015 training set. The evaluation shows that our tracking approach is able to track objects with high relative motion. In addition, we provide results of our approach on the MOT 2D 2015 test set for comparison with previous works. We achieve a MOTA score of 32.1.
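The association step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the affinity, so the sketch assumes a purely locality-based affinity (intersection over union between flow-predicted masks and the masks of the next frame) and omits visual similarity. The names `warp_mask`, `iou`, `hungarian`, `associate` and the threshold `min_affinity` are hypothetical. Masks are assumed to be sets of `(row, col)` pixels, and the flow a grid of `(d_row, d_col)` vectors.

```python
def warp_mask(mask, flow):
    """Predict a mask in the next frame by moving each pixel along its flow vector."""
    return {(r + round(flow[r][c][0]), c + round(flow[r][c][1])) for r, c in mask}


def iou(a, b):
    """Intersection over union of two pixel sets (assumed locality measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def hungarian(cost):
    """Minimum-cost assignment (Hungarian method with potentials, O(n^2 m)).

    Returns sorted (row, col) pairs; every row is matched if rows <= cols.
    """
    n, m = len(cost), len(cost[0])
    if n > m:  # the core loop needs rows <= cols, so solve the transpose
        transposed = [list(col) for col in zip(*cost)]
        return sorted((i, j) for j, i in hungarian(transposed))
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # row and column potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:  # grow an augmenting path from row i
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j])


def associate(prev_masks, flow, next_masks, min_affinity=0.1):
    """Match instances of frame t to frame t+1; returns (prev_idx, next_idx) pairs."""
    if not prev_masks or not next_masks:
        return []
    predicted = [warp_mask(mask, flow) for mask in prev_masks]
    affinity = [[iou(p, n) for n in next_masks] for p in predicted]
    cost = [[1.0 - a for a in row] for row in affinity]  # high affinity = low cost
    # Drop assignments whose affinity is too low to be a plausible match.
    return [(i, j) for i, j in hungarian(cost) if affinity[i][j] >= min_affinity]


# Toy example: two 2x2 objects move 3 pixels to the right between frames,
# so their masks in consecutive frames do not overlap without flow.
flow = [[(0, 3)] * 8 for _ in range(8)]
prev_masks = [{(r, c) for r in (0, 1) for c in (0, 1)},
              {(r, c) for r in (4, 5) for c in (0, 1)}]
next_masks = [{(r, c) for r in (4, 5) for c in (3, 4)},   # second object
              {(r, c) for r in (0, 1) for c in (3, 4)}]   # first object
print(associate(prev_masks, flow, next_masks))  # [(0, 1), (1, 0)]
```

In the toy example the displacement exceeds the object width, so an overlap test on the raw masks would find no match. Warping the masks with the flow first is what makes the association possible, which is the intuition behind tracking objects with high relative motion.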