CV · Apr 10, 2025

WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer

arXiv:2504.07441v12 citations · h-index: 8 · SMC
Originality: Incremental advance
AI Analysis

This work addresses robust object detection for Unmanned Surface Vehicles in water environments. It is rated as an incremental advance because it builds on existing vision-radar fusion methods.

The paper proposes WS-DETR, a vision-radar fusion model for robust object detection by Unmanned Surface Vehicles in complex water environments. The model targets cross-modal feature conflicts, achieves state-of-the-art performance on the WaterScenes dataset, and is reported to keep its lead under adverse weather and lighting conditions.

Robust object detection for Unmanned Surface Vehicles (USVs) in complex water environments is essential for reliable navigation and operation. Specifically, water surface object detection faces challenges from blurred edges and diverse object scales. Although vision-radar fusion offers a feasible solution, existing approaches suffer from cross-modal feature conflicts, which negatively affect model robustness. To address this problem, we propose a robust vision-radar fusion model WS-DETR. In particular, we first introduce a Multi-Scale Edge Information Integration (MSEII) module to enhance edge perception and a Hierarchical Feature Aggregator (HiFA) to boost multi-scale object detection in the encoder. Then, we adopt self-moving point representations for continuous convolution and residual connection to efficiently extract irregular features under the scenarios of irregular point cloud data. To further mitigate cross-modal conflicts, an Adaptive Feature Interactive Fusion (AFIF) module is introduced to integrate visual and radar features through geometric alignment and semantic fusion. Extensive experiments on the WaterScenes dataset demonstrate that WS-DETR achieves state-of-the-art (SOTA) performance, maintaining its superiority even under adverse weather and lighting conditions.
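The abstract's "geometric alignment and semantic fusion" can be illustrated with a generic sketch. This is not the paper's AFIF module, whose details are not given here. It assumes a pinhole camera model, radar points already expressed in the camera frame, and a simple sigmoid gate. All function names, parameters, and weights (`project_points`, `rasterize`, `gated_fuse`, `w_v`, `w_r`, `bias`) are hypothetical.

```python
import math

def project_points(points, fx, fy, cx, cy):
    """Geometric alignment (assumed pinhole model): map 3D points
    (x right, y down, z forward) to pixel coordinates.
    Points at or behind the camera (z <= 0) are dropped."""
    pixels = []
    for x, y, z in points:
        if z <= 0:
            continue
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels

def rasterize(pixels, width, height, stride):
    """Scatter projected points into a coarse occupancy grid so the
    radar data shares a spatial layout with a vision feature map."""
    cols, rows = width // stride, height // stride
    grid = [[0.0] * cols for _ in range(rows)]
    for u, v in pixels:
        c, r = int(u // stride), int(v // stride)
        if 0 <= r < rows and 0 <= c < cols:
            grid[r][c] = 1.0
    return grid

def gated_fuse(vision, radar, w_v=1.0, w_r=1.0, bias=0.0):
    """Hypothetical fusion step: a per-cell sigmoid gate g decides how
    much of each modality to keep, out = g * vision + (1 - g) * radar.
    In a trained model w_v, w_r, and bias would be learned."""
    fused = []
    for v_row, r_row in zip(vision, radar):
        row = []
        for v, r in zip(v_row, r_row):
            g = 1.0 / (1.0 + math.exp(-(w_v * v + w_r * r + bias)))
            row.append(g * v + (1.0 - g) * r)
        fused.append(row)
    return fused
```

Because the gate lies strictly between 0 and 1, each fused value is a convex blend of the two inputs, so neither modality can be amplified beyond its own range. That is one simple way to limit the effect of conflicting features, under the assumptions stated above.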

