LABELMAKER: Automatic Semantic Label Generation from RGB-D Trajectories
This work addresses the high cost of semantic annotation for computer vision researchers and practitioners, offering an incremental improvement by automating the labeling process.
The authors tackle the high cost of acquiring semantic annotations for perception models by introducing a fully automated 2D/3D labeling framework. The framework generates labels for RGB-D scans with accuracy equal to or better than that of manually annotated datasets such as ScanNet. With it, the authors produce significantly better labels for ScanNet and automatically label the ARKitScenes dataset.
Semantic annotations are indispensable for training and evaluating perception models, yet they are very costly to acquire. This work introduces a fully automated 2D/3D labeling framework that, without any human intervention, can generate labels for RGB-D scans at a level of accuracy equal to (or better than) that of comparable manually annotated datasets such as ScanNet. Our approach is based on an ensemble of state-of-the-art segmentation models and 3D lifting through neural rendering. We demonstrate the effectiveness of our LabelMaker pipeline by generating significantly better labels for the ScanNet dataset and automatically labeling the previously unlabeled ARKitScenes dataset. Code and models are available at https://labelmaker.org.
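To make the ensemble idea concrete, the following minimal sketch fuses per-pixel predictions from several segmentation models by majority vote. The abstract does not specify how the models are combined, so the voting rule, the `min_votes` threshold, the `UNKNOWN` label id, and the function names `vote_pixel` and `ensemble_vote` are all illustrative assumptions, not the LabelMaker implementation. The sketch also assumes every model's output has already been mapped to a shared label space, and it leaves out the 3D lifting step entirely.

```python
from collections import Counter

UNKNOWN = -1  # hypothetical id for "no consensus"; not taken from the paper


def vote_pixel(votes, min_votes=2):
    """Return the majority label among votes, or UNKNOWN if agreement is too weak."""
    # Models that abstained (UNKNOWN) do not contribute a vote.
    counts = Counter(v for v in votes if v != UNKNOWN)
    if not counts:
        return UNKNOWN
    ranked = counts.most_common(2)
    label, top = ranked[0]
    # Reject weak consensus and exact ties rather than guessing.
    if top < min_votes or (len(ranked) > 1 and ranked[1][1] == top):
        return UNKNOWN
    return label


def ensemble_vote(predictions, min_votes=2):
    """Fuse several H x W label maps (nested lists of ints) into one map."""
    height, width = len(predictions[0]), len(predictions[0][0])
    for p in predictions:
        # All models must predict on the same image grid.
        if len(p) != height or any(len(row) != width for row in p):
            raise ValueError("all prediction maps must share the same shape")
    return [
        [vote_pixel([p[r][c] for p in predictions], min_votes) for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    # Three toy 2 x 2 predictions standing in for three segmentation models.
    preds = [
        [[1, 2], [3, 3]],
        [[1, 2], [4, 3]],
        [[1, 5], [4, 0]],
    ]
    print(ensemble_vote(preds))
```

Pixels where the models disagree too strongly are marked `UNKNOWN` instead of being assigned a label. In a pipeline like the one described, such pixels could plausibly be filled in later by the multi-view 3D lifting stage, though that is an assumption here as well.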