UCDNet: Multi-UAV Collaborative 3D Object Detection Network by Reliable Feature Mapping
This work addresses a domain-specific problem for applications like traffic monitoring and delivery services, representing an incremental improvement in multi-UAV collaborative perception.
The paper tackles the challenge of achieving precise and consistent feature mapping from 2D images to 3D space in multi-UAV collaborative 3D object detection. It proposes UCDNet, which uses UAV-to-ground depth information as a prior and adds a homologous points geometric consistency loss, yielding mAP gains of 4.7% and 10% over the baseline on the AeroCollab3D and CoPerception-UAVs datasets, respectively.
Multi-UAV collaborative 3D object detection can perceive and comprehend complex environments by integrating complementary information, with applications including traffic monitoring, delivery services, and agricultural management. However, the very wide observation range of aerial remote sensing and the significant perspective differences across multiple UAVs make it challenging to achieve precise and consistent feature mapping from 2D images to 3D space in the multi-UAV collaborative 3D object detection paradigm. To address this problem, we propose a camera-based multi-UAV collaborative 3D object detection paradigm called UCDNet. Specifically, the depth from the UAVs to the ground is explicitly used as a strong prior, providing a reference for more accurate and generalizable feature mapping. Additionally, we design a homologous points geometric consistency loss as auxiliary self-supervision that acts directly on the feature mapping module, thereby strengthening the global consistency of multi-view perception. Experiments on the AeroCollab3D and CoPerception-UAVs datasets show that our method improves mAP by 4.7% and 10%, respectively, over the baseline, which demonstrates the superiority of UCDNet.
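The abstract does not spell out how the ground depth prior or the consistency loss is computed, so the following is only a minimal geometric sketch of the two ideas, not the authors' implementation. It assumes a pinhole camera, a flat ground plane at a known height, and known camera poses; every function name and parameter (`ground_depth`, `backproject`, `consistency_loss`, `R_wc`, `ground_z`) is hypothetical. The intuition it illustrates: if the depth used to lift a pixel into 3D is correct, homologous pixels seen by two UAVs land on the same 3D point, and the distance between the lifted points can serve as a self-supervised penalty.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def mat_vec(m: Mat3, v: Vec3) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def pixel_ray_world(u, v, fx, fy, cx, cy, R_wc: Mat3) -> Vec3:
    """World-frame direction of the ray through pixel (u, v).

    R_wc rotates camera coordinates into world coordinates. The ray is
    scaled so that its camera-frame z component equals 1.
    """
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    return mat_vec(R_wc, d_cam)


def ground_depth(u, v, fx, fy, cx, cy, R_wc: Mat3, cam_pos: Vec3,
                 ground_z: float = 0.0) -> float:
    """Depth (along the optical axis) where the pixel ray meets z = ground_z.

    Assumption: the ground is a flat horizontal plane. This is a stand-in
    for the UAV-to-ground depth prior described in the text.
    """
    d = pixel_ray_world(u, v, fx, fy, cx, cy, R_wc)
    if abs(d[2]) < 1e-9:
        raise ValueError("ray is parallel to the ground plane")
    t = (ground_z - cam_pos[2]) / d[2]
    if t <= 0:
        raise ValueError("ground plane is behind the camera along this ray")
    # The camera-frame ray has z = 1, so the scale t is the depth itself.
    return t


def backproject(u, v, depth, fx, fy, cx, cy, R_wc: Mat3, cam_pos: Vec3) -> Vec3:
    """Lift pixel (u, v) with the given depth to a 3D point in the world frame."""
    d = pixel_ray_world(u, v, fx, fy, cx, cy, R_wc)
    return tuple(cam_pos[i] + depth * d[i] for i in range(3))


def consistency_loss(points_a: Sequence[Vec3], points_b: Sequence[Vec3]) -> float:
    """Mean Euclidean distance between 3D points lifted from homologous pixels.

    points_a[i] and points_b[i] must come from the same physical point as
    seen by two different UAVs. Hypothetical form of the loss.
    """
    dists = [math.dist(p, q) for p, q in zip(points_a, points_b)]
    return sum(dists) / len(dists)


# Two downward-looking cameras at 100 m altitude, 20 m apart along x.
# Camera axes: x right, y down, z forward; looking down maps z to world -z.
R_DOWN = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
K = dict(fx=1000.0, fy=1000.0, cx=500.0, cy=500.0)
POS_A, POS_B = (0.0, 0.0, 100.0), (20.0, 0.0, 100.0)

# The ground point (10, 0, 0) projects to u=600 in camera A and u=400 in B.
depth_a = ground_depth(600.0, 500.0, R_wc=R_DOWN, cam_pos=POS_A, **K)
depth_b = ground_depth(400.0, 500.0, R_wc=R_DOWN, cam_pos=POS_B, **K)
point_a = backproject(600.0, 500.0, depth_a, R_wc=R_DOWN, cam_pos=POS_A, **K)
point_b = backproject(400.0, 500.0, depth_b, R_wc=R_DOWN, cam_pos=POS_B, **K)

# Lifting camera B's pixel with a wrong depth breaks the agreement.
point_b_bad = backproject(400.0, 500.0, 90.0, R_wc=R_DOWN, cam_pos=POS_B, **K)

loss_good = consistency_loss([point_a], [point_b])
loss_bad = consistency_loss([point_a], [point_b_bad])
```

In this toy setup the loss vanishes when both views use the ground-plane depth and grows when one view's depth is off, which is the behavior a consistency term of this kind is meant to exploit. A learned model would predict depth distributions per pixel rather than use a closed-form plane intersection; the sketch only shows the geometry.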