Scan2Part: Fine-grained and Hierarchical Part-level Understanding of Real-World 3D Scans
This work addresses part-level understanding of real-world 3D scans, with applications in robotics and augmented reality. Its contribution is incremental in that it builds on existing segmentation and dataset-creation approaches.
The authors tackle the problem of segmenting individual parts of objects in noisy indoor 3D scans by introducing Scan2Part, a method that predicts fine-grained per-object part labels. The method is trained on a new dataset of the same name, which provides 242,081 correspondences between PartNet parts and 1,506 ScanNet scenes.
We propose Scan2Part, a method to segment individual parts of objects in real-world, noisy indoor RGB-D scans. To this end, we vary the part hierarchies of objects in indoor scenes and explore their effect on scene understanding models. Specifically, we use a sparse U-Net-based architecture that captures the fine-scale detail of the underlying 3D scan geometry by leveraging a multi-scale feature hierarchy. To train our method, we introduce the Scan2Part dataset, which is the first large-scale collection providing detailed semantic labels at the part level in the real-world setting. In total, we provide 242,081 correspondences between 53,618 PartNet parts of 2,477 ShapeNet objects and 1,506 ScanNet scenes, at two voxel resolutions of (2 cm)$^3$ and (5 cm)$^3$. As output, we are able to predict fine-grained per-object part labels, even when the geometry is coarse or partially missing.
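To make the two spatial resolutions concrete, the following is a minimal sketch of how a labeled point cloud could be quantized to a voxel grid at 2 cm and 5 cm, with each voxel taking the majority part label of the points that fall into it. This is not the authors' preprocessing pipeline: the function name `voxelize`, the majority-vote rule, and the toy points are all assumptions made for illustration.

```python
import numpy as np

def voxelize(points, labels, voxel_size):
    """Hypothetical helper: quantize labeled points to a voxel grid.

    points: (N, 3) float array of coordinates in meters.
    labels: (N,) non-negative integer part labels.
    voxel_size: voxel edge length in meters (e.g. 0.02 or 0.05).
    Returns the occupied voxel indices (M, 3) and one label per voxel,
    chosen by majority vote (an assumption, not the paper's rule).
    """
    coords = np.floor(points / voxel_size).astype(np.int64)
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # guard against shape differences across NumPy versions
    num_labels = int(labels.max()) + 1
    votes = np.zeros((len(voxels), num_labels), dtype=np.int64)
    np.add.at(votes, (inverse, labels), 1)  # count label occurrences per voxel
    return voxels, votes.argmax(axis=1)

# Toy data: four points along the x-axis with two part labels.
points = np.array([
    [0.005, 0.0, 0.0],
    [0.015, 0.0, 0.0],
    [0.035, 0.0, 0.0],
    [0.065, 0.0, 0.0],
])
labels = np.array([0, 0, 1, 1])

fine_voxels, fine_labels = voxelize(points, labels, voxel_size=0.02)
coarse_voxels, coarse_labels = voxelize(points, labels, voxel_size=0.05)
```

On this toy input the coarser grid merges points with different part labels into a single voxel, which illustrates why fine-grained part labels become harder to recover as the geometry gets coarser.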