Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID
The paper provides a strong baseline for multi-UAV tracking, addressing challenges such as low contrast and small targets in thermal infrared video, but its contribution is incremental because it builds on existing detection and tracking methods.
This paper tackles multi-UAV tracking in thermal infrared video by proposing a framework based on YOLOv12 and BoT-SORT, achieving competitive performance on the 4th Anti-UAV Challenge metrics without using contrast enhancement or temporal fusion.
Detecting and tracking multiple unmanned aerial vehicles (UAVs) in thermal infrared video is inherently challenging due to low contrast, environmental noise, and small target sizes. This paper provides a straightforward approach to multi-UAV tracking in thermal infrared video, leveraging recent advances in detection and tracking. Instead of relying on the well-established YOLOv5 with DeepSORT combination, we present a tracking framework built on YOLOv12 and BoT-SORT, enhanced with tailored training and inference strategies. We evaluate our approach using the metrics of the 4th Anti-UAV Challenge and reach competitive performance. Notably, we achieve strong results without using contrast enhancement or temporal information fusion to enrich UAV features, which positions our approach as a "Strong Baseline" for multi-UAV tracking tasks. We provide implementation details, in-depth experimental analysis, and a discussion of potential improvements. The code is available at https://github.com/wish44165/YOLOv12-BoT-SORT-ReID .
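
To make the detector-plus-tracker pairing concrete, the following is a minimal sketch of the association step in a generic tracking-by-detection loop: per-frame detections are matched to existing tracks by bounding-box overlap. It is an illustration only, not the authors' implementation. BoT-SORT itself additionally uses Kalman-filter motion prediction, camera-motion compensation, and optional ReID appearance cues, none of which appear here. The names `iou` and `associate`, the greedy matching rule, and the 0.3 threshold are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    iou_threshold: float = 0.3,  # assumed value, not taken from the paper
) -> Tuple[Dict[int, int], List[int]]:
    """Greedily match track IDs to detection indices by descending IoU.

    Returns (matches, unmatched_detection_indices). Unmatched detections
    would typically start new tracks in a full tracker.
    """
    # Score every (track, detection) pair, best overlap first.
    pairs = sorted(
        (
            (iou(box, det), tid, j)
            for tid, box in tracks.items()
            for j, det in enumerate(detections)
        ),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < iou_threshold:
            break  # all remaining pairs overlap too little
        if tid in matches or j in used:
            continue  # each track and each detection is used at most once
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched


# Two existing tracks and three detections in the next frame.
tracks = {1: (10, 10, 20, 20), 2: (50, 50, 60, 60)}
detections = [(51, 51, 61, 61), (11, 10, 21, 20), (100, 100, 110, 110)]
matches, unmatched = associate(tracks, detections)
print(matches, unmatched)
```

Here track 1 is paired with the second detection, track 2 with the first, and the third detection is left unmatched. Small thermal targets yield low IoU even for slight displacements, which is one reason practical trackers add motion prediction and appearance cues on top of plain overlap matching.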