CV, AI | Sep 21, 2023

2DDATA: 2D Detection Annotations Transmittable Aggregation for Semantic Segmentation on Point Cloud

arXiv:2309.11755v1 | h-index: 2
Originality: Incremental advance
AI Analysis

The paper addresses the expense of creating multi-modality datasets, a problem for researchers and practitioners in autonomous driving and robotics, though it appears incremental because it builds on prior fusion work.

The paper tackles the high cost and complexity of multi-modality data collection for LiDAR-camera fusion by introducing 2DDATA, a method that uses easily acquired 2D bounding box annotations to transmit prior information to 3D encoders, demonstrating feasibility without requiring precise calibration.

Multi-modality models have recently been introduced to exploit the complementary information from different sensors such as LiDAR and cameras. These models require paired data along with precise calibration across all modalities. This complicated calibration greatly increases the cost of collecting such high-quality datasets and hinders their application to practical scenarios. Building on previous works, we not only fuse information from multiple modalities without the above issues but also make fuller use of the information in the RGB modality. We introduce 2D Detection Annotations Transmittable Aggregation (2DDATA), which adds a data-specific branch, called the Local Object Branch, that handles the points within a given bounding box; we choose this design because 2D bounding box annotations are easy to acquire. We demonstrate that our simple design can transmit bounding box prior information to the 3D encoder model, showing the feasibility of large multi-modality models fused with modality-specific data.
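
As a rough illustration of what "points in a certain bounding box" could mean in practice, the sketch below selects the LiDAR points whose image-plane projections fall inside a 2D detection box. The abstract does not specify how points are associated with boxes, so this is an assumption: the function name `points_in_box`, the `(x_min, y_min, x_max, y_max)` box layout, and the premise that each point already has an approximate pixel coordinate are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def points_in_box(pixels: Sequence[Tuple[float, float]], box: Box) -> List[int]:
    """Return indices of points whose (u, v) projection lies inside the box.

    Assumes each point already has an approximate image-plane coordinate;
    how that coordinate is obtained is outside the scope of this sketch.
    """
    x_min, y_min, x_max, y_max = box
    return [
        i
        for i, (u, v) in enumerate(pixels)
        if x_min <= u <= x_max and y_min <= v <= y_max
    ]


# Toy data: four projected points and one 2D detection box.
pixels = [(10.0, 20.0), (55.0, 60.0), (120.0, 80.0), (70.0, 65.0)]
box = (50.0, 50.0, 100.0, 100.0)

# Indices of the points that a box-specific branch would receive.
local_indices = points_in_box(pixels, box)
```

Here `local_indices` holds the subset of points that would be routed to a box-specific branch; the remaining points would only pass through the main 3D encoder.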
