CVDec 12, 2023

MWSIS: Multimodal Weakly Supervised Instance Segmentation with 2D Box Annotations for Autonomous Driving

arXiv:2312.06988v411 citationsh-index: 8Has CodeAAAI
Originality Highly original
AI Analysis

This work addresses the high cost of manual mask annotation for instance segmentation in autonomous driving, offering a weakly supervised approach that is incremental but reduces annotation needs significantly.

The paper tackles the problem of reducing annotation costs for instance segmentation in autonomous driving by proposing MWSIS, a framework that uses only 2D box annotations to simultaneously segment 2D and 3D instances, achieving improvements of 2.59% mAP for 2D and 12.75% mAP for 3D on the Waymo dataset.

Instance segmentation is a fundamental research in computer vision, especially in autonomous driving. However, manual mask annotation for instance segmentation is quite time-consuming and costly. To address this problem, some prior works attempt to apply weakly supervised manner by exploring 2D or 3D boxes. However, no one has ever successfully segmented 2D and 3D instances simultaneously by only using 2D box annotations, which could further reduce the annotation cost by an order of magnitude. Thus, we propose a novel framework called Multimodal Weakly Supervised Instance Segmentation (MWSIS), which incorporates various fine-grained label generation and correction modules for both 2D and 3D modalities to improve the quality of pseudo labels, along with a new multimodal cross-supervision approach, named Consistency Sparse Cross-modal Supervision (CSCS), to reduce the inconsistency of multimodal predictions by response distillation. Particularly, transferring the 3D backbone to downstream tasks not only improves the performance of the 3D detectors, but also outperforms fully supervised instance segmentation with only 5% fully supervised annotations. On the Waymo dataset, the proposed framework demonstrates significant improvements over the baseline, especially achieving 2.59% mAP and 12.75% mAP increases for 2D and 3D instance segmentation tasks, respectively. The code is available at https://github.com/jiangxb98/mwsis-plugin.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes