Tracklet Association by Online Target-Specific Metric Learning and Coherent Dynamics Estimation
This work addresses the challenge of maintaining consistent identities in crowded or occluded scenes in multi-person tracking, and represents an incremental improvement over existing methods.
The paper tackles the problem of identity switches and missed detections in long-term multi-person tracking by developing a method that learns target-specific appearance metrics and motion dynamics online during tracking. Experimental results show it outperforms several state-of-the-art tracking methods on public datasets.
In this paper, we present a novel method for tracklet (track fragment) association in long-term multi-person tracking, based on online target-specific metric learning and coherent dynamics estimation and solved by network flow optimization. The proposed framework exploits appearance and motion cues to prevent identity switches during tracking and to recover missed detections. Both the target-specific metrics (appearance cue) and the motion dynamics (motion cue) are learned and estimated online, i.e., during the tracking process. Our approach remains effective even when such cues fail to identify or follow the target due to occlusions or object-to-object interactions. We also propose to learn the weights of these two tracking cues so that difficult situations, such as severe occlusions and object-to-object interactions, are handled effectively. Our method has been validated on several public datasets, and the experimental results show that it outperforms several state-of-the-art tracking methods.
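As a rough illustration of how a weighted appearance cue and motion cue might be combined into a tracklet linking cost, consider the following minimal sketch. It is not the method of the paper. It assumes that each target-specific metric is a Mahalanobis-style matrix, that motion is predicted with a constant-velocity model, and that the cue weights `w_app` and `w_mot` are fixed by hand rather than learned. A brute-force one-to-one matching stands in for the network flow optimization, and all function names and dictionary fields are hypothetical.

```python
from itertools import permutations
import math


def metric_distance_sq(x, y, M):
    """Squared distance (x - y)^T M (x - y) under a metric matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def motion_gap(tail_pos, tail_vel, head_pos, dt):
    """Gap between a constant-velocity prediction and the later tracklet's start."""
    predicted = [p + v * dt for p, v in zip(tail_pos, tail_vel)]
    return math.dist(predicted, head_pos)


def link_cost(a, b, w_app, w_mot):
    """Weighted sum of appearance and motion costs for linking tracklet a to b."""
    appearance = metric_distance_sq(a["feat"], b["feat"], a["metric"])
    motion = motion_gap(a["pos"], a["vel"], b["pos"], b["t"] - a["t"])
    return w_app * appearance + w_mot * motion


def associate(earlier, later, w_app=0.5, w_mot=0.5):
    """Exhaustively find the cheapest one-to-one linking (toy sizes only).

    Assumes len(earlier) <= len(later); a stand-in for a min-cost flow solver.
    """
    best, best_cost = None, math.inf
    for perm in permutations(range(len(later)), len(earlier)):
        cost = sum(link_cost(earlier[i], later[j], w_app, w_mot)
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(enumerate(best)), best_cost


# Toy data: two tracklets end at t=0, two tracklets start at t=3.
identity = [[1.0, 0.0], [0.0, 1.0]]
earlier = [
    {"feat": [1.0, 0.0], "metric": identity, "pos": [0.0, 0.0], "vel": [1.0, 0.0], "t": 0},
    {"feat": [0.0, 1.0], "metric": identity, "pos": [0.0, 5.0], "vel": [1.0, 0.0], "t": 0},
]
later = [
    {"feat": [0.1, 0.9], "pos": [3.0, 5.0], "t": 3},
    {"feat": [0.9, 0.1], "pos": [3.0, 0.0], "t": 3},
]

links, total_cost = associate(earlier, later)
```

In this toy setup, both cues agree that the first earlier tracklet continues as the second later one and vice versa, so `links` pairs them accordingly. In a setting like the one described above, the metric matrices, the dynamics, and the weights would all be updated online, and the matching would be solved over the whole sequence rather than by enumeration.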