3D Spatial Recognition without Spatially Labeled 3D
This work addresses the cost of annotating 3D data for spatial recognition in domains such as robotics and autonomous driving by proposing a weakly-supervised approach.
The paper tackles 3D recognition tasks such as semantic segmentation and object detection without spatially labeled 3D data, using only scene-level class tags. It improves on prior weakly-supervised segmentation results by more than 6% mIoU and establishes the first benchmark for weakly-supervised 3D object detection.
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition, requiring only scene-level class tags as supervision. WyPR jointly addresses three core 3D recognition tasks: point-level semantic segmentation, 3D proposal generation, and 3D object detection, coupling their predictions through self- and cross-task consistency losses. We show that, in conjunction with standard multiple-instance learning objectives, WyPR can detect and segment objects in point cloud data without access to any spatial labels at training time. We demonstrate its efficacy on the ScanNet and S3DIS datasets, outperforming the prior state of the art on weakly-supervised segmentation by more than 6% mIoU. In addition, we set up the first benchmark for weakly-supervised 3D object detection on both datasets, where WyPR outperforms standard approaches and establishes strong baselines for future work.
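To make the idea of a multiple-instance learning objective driven by scene-level tags concrete, here is a minimal sketch in plain Python. It is not WyPR's implementation: the abstract does not say how point predictions are pooled, so max pooling is an assumption, and the function name `mil_scene_loss` and its arguments are hypothetical. The sketch pools per-point class logits into one scene-level logit per class and applies binary cross-entropy against the scene tags, so no point-level label is ever used.

```python
import math


def mil_scene_loss(point_logits, scene_tags):
    """Hypothetical MIL loss: supervise point logits with scene-level tags only.

    point_logits: list of per-point lists, one logit per class.
    scene_tags:   list of 0/1 flags, one per class (1 = class present in scene).
    """
    num_classes = len(scene_tags)
    loss = 0.0
    for c in range(num_classes):
        # Pool over points. Max pooling is an assumption made for this sketch;
        # average or log-sum-exp pooling are common alternatives.
        scene_logit = max(point[c] for point in point_logits)
        y = scene_tags[c]
        # Numerically stable binary cross-entropy on the pooled logit.
        loss += (
            max(scene_logit, 0.0)
            - scene_logit * y
            + math.log1p(math.exp(-abs(scene_logit)))
        )
    return loss / num_classes


# Two points, two classes: the first point strongly predicts class 0.
logits = [[5.0, -5.0], [-5.0, -5.0]]
loss_matching = mil_scene_loss(logits, [1, 0])     # tags agree with the points
loss_mismatched = mil_scene_loss(logits, [0, 1])   # tags disagree
```

In this toy scene the loss is small when the tags agree with the pooled predictions and large when they do not, which is the signal that lets point-level predictions be trained from scene-level tags alone.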