CV AI · Aug 20, 2024

MV-MOS: Multi-View Feature Fusion for 3D Moving Object Segmentation

Jintao Cheng, Xingming Chen, Jinxin Liang, Xiaoyu Tang, Xieyuanli Chen, Dachuan Li

arXiv:2408.10602v12.02 citations · h-index: 7 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses moving object segmentation for autonomous driving and robotics, presenting an incremental improvement through multi-view feature fusion.

The paper tackles the problem of exploiting motion and semantic features for 3D moving object segmentation in autonomous driving and robotics. It proposes MV-MOS, a multi-view model that fuses features from bird's eye view (BEV) and range view (RV) representations, and reports state-of-the-art performance on the SemanticKITTI benchmark.

Effectively summarizing dense 3D point cloud data and extracting motion information of moving objects (moving object segmentation, MOS) is crucial to autonomous driving and robotics applications. How to effectively utilize motion and semantic features and avoid information loss during 3D-to-2D projection is still a key challenge. In this paper, we propose a novel multi-view MOS model (MV-MOS) by fusing motion-semantic features from different 2D representations of point clouds. To effectively exploit complementary information, the motion branches of the proposed model combine motion features from both bird's eye view (BEV) and range view (RV) representations. In addition, a semantic branch is introduced to provide supplementary semantic features of moving objects. Finally, a Mamba module is utilized to fuse the semantic features with motion features and provide effective guidance for the motion branches. We validated the effectiveness of the proposed multi-branch fusion MOS framework via comprehensive experiments, and our proposed model outperforms existing state-of-the-art models on the SemanticKITTI benchmark.
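To make the two 2D representations named in the abstract concrete, here is a minimal NumPy sketch of how a LiDAR point cloud is commonly mapped to a range view (spherical projection) and to a bird's eye view grid. It is not taken from the MV-MOS code. The function names, image size (64 x 2048), vertical field of view (+3 to -25 degrees), BEV extent, and cell size are all illustrative assumptions, and the paper's actual projection settings may differ.

```python
import numpy as np


def range_view_indices(points, height=64, width=2048,
                       fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map each (x, y, z) point to a (row, col) pixel of a spherical range image.

    All default parameters are assumed values for illustration only.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1)

    yaw = np.arctan2(y, x)                           # horizontal angle
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))   # vertical angle

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Normalise the angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height

    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)
    return rows, cols, depth


def bev_indices(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0), cell=0.5):
    """Map each point to a (row, col) cell of a top-down grid.

    Returns a boolean mask marking the points that fall inside the grid;
    indices of points outside the grid should be ignored.
    """
    x, y = points[:, 0], points[:, 1]
    inside = ((x >= x_range[0]) & (x < x_range[1]) &
              (y >= y_range[0]) & (y < y_range[1]))
    rows = np.floor((x - x_range[0]) / cell).astype(np.int64)
    cols = np.floor((y - y_range[0]) / cell).astype(np.int64)
    return rows, cols, inside
```

Both mappings are many-to-one: several 3D points can land in the same pixel or cell, which is one source of the information loss during 3D-to-2D projection that the abstract mentions. Comparing the number of points with the number of distinct `(row, col)` pairs, for example via `np.unique(np.stack([rows, cols], axis=1), axis=0)`, gives a rough measure of how many points collide in each view.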

