CV · May 18

Weakly Supervised Cross-Modal Learning for 4D Radar Scene Flow Estimation

arXiv:2605.18507 · 63.4 · Has Code
Predicted impact: top 53% in CV · last 90 days
Originality: Incremental advance
AI Analysis

The paper addresses the difficulty of obtaining ground-truth data for radar scene flow by introducing a practical weakly supervised approach that reduces reliance on expensive sensors such as LiDAR.

The paper proposes a weakly supervised framework for 4D radar scene flow estimation using only images and odometry during training, avoiding costly LiDAR supervision. The method achieves state-of-the-art performance on the VoD dataset, outperforming both cross-modal supervised and fully supervised methods.

Due to the difficulty of obtaining ground-truth data for 4D radar scene flow estimation, previous methods typically rely on either self-supervised losses or cross-modal supervision using 3D LiDAR data, 2D images, and odometry. However, self-supervised approaches often yield suboptimal results due to radar's inherently low-fidelity measurements, while existing cross-modal supervised methods introduce complex multi-task architectures and require costly LiDAR sensors to generate pseudo radar scene flow labels from pretrained 3D tracking models. To overcome these limitations, we propose a task-specific iterative framework for weakly supervised radar scene flow learning, using only images and odometry for auxiliary supervision during training. Specifically, we establish two novel instance-aware self-supervised losses by exploiting off-the-shelf 2D tracking and segmentation algorithms to obtain tracked instance masks, which are back-projected into 3D space to provide instance-level semantic guidance; for static regions, we integrate vehicle odometry with radar's intrinsic motion cues to construct a rigid static loss. Extensive experiments on the real-world View-of-Delft (VoD) dataset demonstrate that our method not only surpasses state-of-the-art cross-modal supervised approaches that rely on 3D multi-object tracking on dense LiDAR point clouds but also outperforms existing fully supervised scene flow estimation methods. The code is open-sourced at https://github.com/FuJingyun/IterFlow.
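The rigid static loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `rigid_static_flow` and `static_loss`, the convention that `T_rel` maps frame-t coordinates into frame-(t+1) coordinates, and the assumption that a boolean static mask is already available (for example, derived from radar radial velocity) are all hypothetical choices made here for illustration.

```python
import numpy as np


def rigid_static_flow(points, T_rel):
    """Flow that ego-motion alone induces on static points.

    points: (N, 3) radar points in frame-t sensor coordinates.
    T_rel:  (4, 4) homogeneous transform taking frame-t coordinates to
            frame-(t+1) coordinates (assumed here to come from odometry).
    """
    R, t = T_rel[:3, :3], T_rel[:3, 3]
    # A static point p appears at R @ p + t in the next frame,
    # so its apparent flow is (R @ p + t) - p.
    return points @ R.T + t - points


def static_loss(pred_flow, points, T_rel, static_mask):
    """Mean L2 error between predicted and ego-motion flow on static points.

    static_mask: (N,) boolean array; how it is obtained is an assumption
                 of this sketch, not something the abstract specifies.
    """
    if not static_mask.any():
        return 0.0
    target = rigid_static_flow(points, T_rel)
    err = np.linalg.norm(pred_flow - target, axis=1)
    return float(err[static_mask].mean())


# Toy usage: the sensor moves 0.5 m forward along x, so static points
# appear to move 0.5 m backward in the sensor frame.
points = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
T_rel = np.eye(4)
T_rel[0, 3] = -0.5
mask = np.array([True, True])
loss_zero_pred = static_loss(np.zeros_like(points), points, T_rel, mask)
```

In this toy case a network that predicts zero flow for both static points is penalised by the full 0.5 m ego-motion displacement, while a prediction equal to `rigid_static_flow(points, T_rel)` incurs no loss.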

