CV · LG · RO · Apr 30, 2025

Is Intermediate Fusion All You Need for UAV-based Collaborative Perception?

arXiv:2504.21774v2 · 1 citation · h-index: 3 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses communication efficiency for UAV-based collaborative perception in intelligent transportation systems, representing an incremental improvement over existing methods.

The paper tackles the high communication overhead of collaborative perception for Unmanned Aerial Vehicles (UAVs). It proposes a late-intermediate fusion framework, LIF, in which agents exchange compact detection results while fusion is performed at the feature representation level. The authors report superior performance at minimal communication bandwidth.

Collaborative perception enhances environmental awareness through inter-agent communication and is regarded as a promising solution to intelligent transportation systems. However, existing collaborative methods for Unmanned Aerial Vehicles (UAVs) overlook the unique characteristics of the UAV perspective, resulting in substantial communication overhead. To address this issue, we propose a novel communication-efficient collaborative perception framework based on late-intermediate fusion, dubbed LIF. The core concept is to exchange informative and compact detection results and shift the fusion stage to the feature representation level. In particular, we leverage vision-guided positional embedding (VPE) and box-based virtual augmented feature (BoBEV) to effectively integrate complementary information from various agents. Additionally, we innovatively introduce an uncertainty-driven communication mechanism that uses uncertainty evaluation to select high-quality and reliable shared areas. Experimental results demonstrate that our LIF achieves superior performance with minimal communication bandwidth, proving its effectiveness and practicality. Code and models are available at https://github.com/uestchjw/LIF.
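The abstract's two central ideas, sharing only reliable detections and turning the received boxes back into a feature-like map, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The `Detection` fields, the scalar `uncertainty` value, the `budget` parameter, and the helper names `select_for_sharing` and `rasterize_boxes` are all hypothetical, and a plain occupancy grid stands in for the BoBEV feature.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    cx: float           # box centre x in BEV metres (assumed convention)
    cy: float           # box centre y in BEV metres
    w: float            # box extent along x
    h: float            # box extent along y
    score: float        # detection confidence
    uncertainty: float  # hypothetical scalar; lower means more reliable


def select_for_sharing(dets: List[Detection],
                       max_uncertainty: float,
                       budget: int) -> List[Detection]:
    """Keep the most reliable detections, up to `budget` boxes.

    Assumption: uncertainty is one scalar per box and the bandwidth
    limit is expressed as a maximum number of shared boxes.
    """
    reliable = [d for d in dets if d.uncertainty <= max_uncertainty]
    reliable.sort(key=lambda d: d.uncertainty)
    return reliable[:budget]


def rasterize_boxes(dets: List[Detection],
                    grid_size: int,
                    cell: float) -> List[List[float]]:
    """Paint axis-aligned boxes onto a square BEV grid.

    Each covered cell stores the highest score of any box covering it.
    The grid is indexed as grid[row (y)][column (x)].
    """
    grid = [[0.0] * grid_size for _ in range(grid_size)]
    for d in dets:
        x0 = max(0, math.floor((d.cx - d.w / 2) / cell))
        x1 = min(grid_size, math.ceil((d.cx + d.w / 2) / cell))
        y0 = max(0, math.floor((d.cy - d.h / 2) / cell))
        y1 = min(grid_size, math.ceil((d.cy + d.h / 2) / cell))
        for row in range(y0, y1):
            for col in range(x0, x1):
                grid[row][col] = max(grid[row][col], d.score)
    return grid


# Usage: one agent filters its detections, the receiver rasterizes them.
local_dets = [
    Detection(2.0, 2.0, 2.0, 2.0, score=0.9, uncertainty=0.1),
    Detection(6.0, 6.0, 2.0, 2.0, score=0.8, uncertainty=0.7),
    Detection(5.0, 1.0, 2.0, 2.0, score=0.6, uncertainty=0.2),
]
shared = select_for_sharing(local_dets, max_uncertainty=0.5, budget=2)
bev = rasterize_boxes(shared, grid_size=8, cell=1.0)
```

In this toy setup each shared box costs only six numbers, whereas sharing a dense feature map scales with the grid resolution and channel count. That is the intuition behind exchanging detection results instead of features.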

Code Implementations: 1 repo
