UAVDB: Point-Guided Masks for UAV Detection and Segmentation
This work addresses the need for scalable, high-resolution datasets for UAV detection in surveillance and security applications. Its contribution is incremental, since it builds on existing annotation methods.
The authors tackle the shortage of datasets for UAV detection and segmentation by introducing UAVDB, a benchmark dataset constructed with a point-guided weak supervision pipeline whose annotations achieve higher IoU than existing annotation techniques.
Accurate detection of Unmanned Aerial Vehicles (UAVs) is critical for surveillance, security, and airspace monitoring. However, existing datasets remain limited in scale, resolution, and the ability to capture objects across extreme size variations. To address these challenges, we present UAVDB, a benchmark dataset for UAV detection and segmentation, constructed via a point-guided weak supervision pipeline. We introduce Patch Intensity Convergence (PIC), a lightweight annotation method that converts trajectory points into bounding boxes, eliminating the need for manual labeling while preserving precise spatial localization. Building upon these annotations, we further generate segmentation masks using SAM2, enriching the dataset with multi-task labels. UAVDB consists of RGB frames from a fixed-camera multi-view video dataset, capturing UAVs across scales ranging from clearly visible objects to near single-pixel instances under diverse conditions. Quantitative results show that PIC combined with SAM2 outperforms existing annotation techniques in terms of IoU. Furthermore, we benchmark YOLO-based detectors on UAVDB, establishing baselines for future research.
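Since the annotation quality above is reported in terms of IoU, a minimal sketch of the standard intersection-over-union computation for two axis-aligned bounding boxes may help make the metric concrete. The function name `box_iou` and the `(x1, y1, x2, y2)` corner format are assumptions chosen for illustration; this is the textbook definition of the metric, not the evaluation code used for UAVDB.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    The corner format is an assumption made for this sketch.
    """
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(a[0], b[0])
    iy1 = max(a[1], b[1])
    ix2 = min(a[2], b[2])
    iy2 = min(a[3], b[3])

    # Clamp to zero so disjoint boxes give no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no defined overlap; return 0.
    return inter / union if union > 0 else 0.0


# A 2x2 box shifted by one pixel horizontally: intersection 2, union 6.
shifted = box_iou((0, 0, 2, 2), (1, 0, 3, 2))  # 1/3
# The same one-pixel shift on a 20x20 box: intersection 380, union 420.
large = box_iou((0, 0, 20, 20), (1, 0, 21, 20))  # 380/420
```

The two calls illustrate why IoU is a demanding metric for the smallest instances: the same one-pixel offset costs a 2x2 box two thirds of its IoU, while a 20x20 box keeps roughly 0.90.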