CV | Nov 13, 2025

HeatV2X: Scalable Heterogeneous Collaborative Perception via Efficient Alignment and Interaction

Yueran Zhao, Zhang Zhang, Chao Sun, Tianze Wang, Chao Yue, Nuoran Li

arXiv:2511.10211v111.83 citations | h-index: 2

Originality: Incremental advance

AI Analysis

This paper addresses scalability and heterogeneity issues in V2X collaborative perception for autonomous vehicles. It is an incremental improvement that introduces novel adaptation techniques.

The paper tackles the challenges of multi-modal and heterogeneous agents in Vehicle-to-Everything (V2X) collaborative perception by proposing HeatV2X, a scalable framework that achieves superior perception performance with significantly reduced training overhead on the OPV2V-H and DAIR-V2X datasets.

Vehicle-to-Everything (V2X) collaborative perception extends sensing beyond the limits of a single vehicle by transmitting information between agents. However, as more agents participate, existing frameworks face two key challenges: (1) the participating agents are inherently multi-modal and heterogeneous, and (2) the collaborative framework must be scalable to accommodate new agents. The former requires effective cross-agent feature alignment to mitigate heterogeneity loss, while the latter renders full-parameter training impractical, highlighting the importance of scalable adaptation. To address these issues, we propose Heterogeneous Adaptation (HeatV2X), a scalable collaborative framework. We first train a high-performance agent based on heterogeneous graph attention as the foundation for collaborative learning. Then, we design Local Heterogeneous Fine-Tuning and Global Collaborative Fine-Tuning to achieve effective alignment and interaction among heterogeneous agents. The former efficiently extracts modality-specific differences using Hetero-Aware Adapters, while the latter employs the Multi-Cognitive Adapter to enhance cross-agent collaboration and fully exploit the fusion potential. These designs enable substantial performance improvement of the collaborative framework with minimal training cost. We evaluate our approach on the OPV2V-H and DAIR-V2X datasets. Experimental results demonstrate that our method achieves superior perception performance with significantly reduced training overhead, outperforming existing state-of-the-art approaches. Our implementation will be released soon.
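The abstract's point that full-parameter training becomes impractical, while adapter-based fine-tuning keeps training cost low, can be illustrated with a rough parameter count. The sketch below assumes a generic residual bottleneck adapter (a down-projection followed by an up-projection) attached to each layer of a frozen backbone. This is a common adapter design and is not necessarily how the Hetero-Aware Adapters or the Multi-Cognitive Adapter are built, since the abstract does not describe their internals. The layer width, layer count, and bottleneck rank are hypothetical values chosen only for illustration.

```python
def linear_params(d_in, d_out, bias=True):
    """Number of parameters in a dense layer mapping d_in -> d_out."""
    return d_in * d_out + (d_out if bias else 0)


def bottleneck_adapter_params(d_model, rank):
    """Parameters of a generic bottleneck adapter (assumed design):
    down-projection d_model -> rank, then up-projection rank -> d_model."""
    return linear_params(d_model, rank) + linear_params(rank, d_model)


def trainable_fraction(d_model, n_layers, rank):
    """Share of all parameters that are trained when the backbone is
    frozen and only one adapter per layer is updated."""
    backbone = n_layers * linear_params(d_model, d_model)  # frozen
    adapters = n_layers * bottleneck_adapter_params(d_model, rank)  # trained
    return adapters / (backbone + adapters)


# Hypothetical sizes, not taken from the paper.
d_model, n_layers, rank = 256, 8, 16
frac = trainable_fraction(d_model, n_layers, rank)
print(f"trainable share with adapters only: {frac:.1%}")
```

Under these assumed sizes only a small share of the parameters receives gradients, which is the general mechanism by which adapter-style fine-tuning lowers the cost of adding a new agent compared with retraining the whole model.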
