CV · Sep 27, 2024

Search3D: Hierarchical Open-Vocabulary 3D Segmentation

arXiv:2409.18431v2 · 32 citations · h-index: 15
Originality: Highly original
AI Analysis

Search3D addresses a limitation of existing methods, which struggle with finer-grained entities in 3D scenes, and offers a more flexible search paradigm for applications such as robotics or AR/VR.

The paper tackles open-vocabulary 3D segmentation by introducing Search3D, which builds hierarchical scene representations covering fine-grained parts, objects, and material-based regions. Search3D outperforms baselines in part segmentation while maintaining strong performance on objects and materials.

Open-vocabulary 3D segmentation enables exploration of 3D spaces using free-form text descriptions. Existing methods for open-vocabulary 3D instance segmentation primarily focus on identifying object-level instances but struggle with finer-grained scene entities such as object parts, or regions described by generic attributes. In this work, we introduce Search3D, an approach to construct hierarchical open-vocabulary 3D scene representations, enabling 3D search at multiple levels of granularity: fine-grained object parts, entire objects, or regions described by attributes like materials. Unlike prior methods, Search3D shifts towards a more flexible open-vocabulary 3D search paradigm, moving beyond explicit object-centric queries. For systematic evaluation, we further contribute a scene-scale open-vocabulary 3D part segmentation benchmark based on MultiScan, along with a set of open-vocabulary fine-grained part annotations on ScanNet++. Search3D outperforms baselines in scene-scale open-vocabulary 3D part segmentation, while maintaining strong performance in segmenting 3D objects and materials. Our project page is http://search3d-segmentation.github.io.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
