FOLT: Fast Multiple Object Tracking from UAV-captured Videos Based on Optical Flow
The paper addresses multiple object tracking in aerial videos for applications such as surveillance and monitoring. The contribution is incremental: it builds on existing MOT methods and adds optimizations specific to UAV data.
The paper tackles multiple object tracking in UAV-captured videos, where small object size and large motion make tracking difficult. It proposes FOLT, which uses optical flow to improve both detection and motion prediction, and reports state-of-the-art performance on the VisDrone and UAVDT datasets.
Multiple object tracking (MOT) has been successfully investigated in computer vision. However, MOT for videos captured by unmanned aerial vehicles (UAVs) remains challenging due to small object size, blurred object appearance, and very large and/or irregular motion of both the ground objects and the UAV platform. In this paper, we propose FOLT to mitigate these problems and achieve fast and accurate MOT in the UAV view. Aiming at a speed-accuracy trade-off, FOLT adopts a modern detector and a lightweight optical flow extractor to extract object detection features and motion features at minimal cost. Given the extracted flow, flow-guided feature augmentation augments the object detection features based on their optical flow, which improves the detection of small objects. Flow-guided motion prediction then predicts each object's position in the next frame, which improves the tracking of objects with very large displacements between adjacent frames. Finally, the tracker matches the detected objects and the predicted objects using a spatial matching scheme to generate tracks for every object. Experiments on the VisDrone and UAVDT datasets show that our proposed model can successfully track small objects with large and irregular motion and outperforms existing state-of-the-art methods on UAV-MOT tasks.
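The last two steps of the pipeline (predicting an object's next position from optical flow, then matching predictions to detections spatially) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the box is shifted by the mean flow inside it, matching is greedy on IoU, and the names `predict_box`, `iou`, `match`, and `iou_threshold` are hypothetical rather than taken from FOLT.

```python
import numpy as np

def predict_box(box, flow):
    """Shift a box (x1, y1, x2, y2) by the mean optical flow inside it.

    Assumption: flow has shape (H, W, 2) with per-pixel (dx, dy) displacements.
    """
    h, w = flow.shape[:2]
    x1, y1, x2, y2 = box
    # Clip to the image and make sure the crop contains at least one pixel.
    c1 = int(np.clip(np.floor(x1), 0, w - 1))
    r1 = int(np.clip(np.floor(y1), 0, h - 1))
    c2 = int(np.clip(np.ceil(x2), c1 + 1, w))
    r2 = int(np.clip(np.ceil(y2), r1 + 1, h))
    dx, dy = flow[r1:r2, c1:c2].reshape(-1, 2).mean(axis=0)
    return np.array([x1 + dx, y1 + dy, x2 + dx, y2 + dy], dtype=float)

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def match(predicted, detected, iou_threshold=0.3):
    """Greedy one-to-one matching by descending IoU.

    Returns a list of (track_index, detection_index) pairs.
    """
    pairs = [(iou(p, d), i, j)
             for i, p in enumerate(predicted)
             for j, d in enumerate(detected)]
    pairs.sort(key=lambda t: t[0], reverse=True)
    used_tracks, used_dets, matches = set(), set(), []
    for score, i, j in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap even less
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return matches

# Toy example: the whole scene moves 20 px to the right between frames.
flow = np.zeros((100, 100, 2))
flow[..., 0] = 20.0
previous_box = [10.0, 10.0, 20.0, 20.0]
detection = [31.0, 10.0, 41.0, 20.0]

print(match([previous_box], [detection]))                     # no overlap without motion prediction
print(match([predict_box(previous_box, flow)], [detection]))  # overlap after the flow shift
```

In this toy case the 10-pixel-wide object moves farther than its own width, so the previous box and the new detection do not overlap at all, while the flow-shifted box does. This is the large-displacement situation the abstract describes. A real tracker would likely replace the greedy step with an optimal assignment and handle unmatched tracks and detections.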