CV | GR | Jan 17, 2024

GARField: Group Anything with Radiance Fields

arXiv:2401.09419v1 | 117 citations | h-index: 54 | CVPR
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of grouping 3D scenes at multiple granularities for applications such as 3D asset extraction. The advance is incremental, since it builds on existing methods such as Segment Anything.

The paper tackles the problem of ambiguous scene decomposition by introducing GARField, an approach that decomposes 3D scenes into a hierarchy of semantically meaningful groups from posed images, resulting in multi-view consistent groupings with higher fidelity than input masks.

Grouping is inherently ambiguous due to the multiple levels of granularity in which one can decompose a scene -- should the wheels of an excavator be considered separate or part of the whole? We present Group Anything with Radiance Fields (GARField), an approach for decomposing 3D scenes into a hierarchy of semantically meaningful groups from posed image inputs. To do this we embrace group ambiguity through physical scale: by optimizing a scale-conditioned 3D affinity feature field, a point in the world can belong to different groups of different sizes. We optimize this field from a set of 2D masks provided by Segment Anything (SAM) in a way that respects coarse-to-fine hierarchy, using scale to consistently fuse conflicting masks from different viewpoints. From this field we can derive a hierarchy of possible groupings via automatic tree construction or user interaction. We evaluate GARField on a variety of in-the-wild scenes and find it effectively extracts groups at many levels: clusters of objects, objects, and various subparts. GARField inherently represents multi-view consistent groupings and produces higher fidelity groups than the input SAM masks. GARField's hierarchical grouping could have exciting downstream applications such as 3D asset extraction or dynamic scene understanding. See the project website at https://www.garfield.studio/
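The idea of scale-conditioned grouping described above can be illustrated with a small toy sketch. This is not the paper's implementation: the learned 3D affinity field is replaced by a hand-written function, and every name here (`POINTS`, `toy_affinity_feature`, `group_at_scale`, the threshold value) is a hypothetical stand-in chosen for illustration. It only shows how the same point can fall into different groups depending on the query scale.

```python
from math import dist

# Hypothetical toy scene: (name, object_id, part_id). Not data from the paper.
POINTS = [
    ("cab", 0, 0),
    ("wheel_a", 0, 1),
    ("wheel_b", 0, 1),
    ("rock", 1, 0),
]


def toy_affinity_feature(point, scale):
    """Hand-written stand-in for a learned scale-conditioned feature field."""
    _, object_id, part_id = point
    # Assumption: at large scales the part coordinate is suppressed, so all
    # parts of one object share a feature; at small scales parts separate.
    part_weight = 0.0 if scale >= 1.0 else 1.0
    return (float(object_id), part_weight * part_id)


def group_at_scale(points, scale, threshold=0.5):
    """Merge points whose features at this scale are closer than threshold."""
    feats = [toy_affinity_feature(p, scale) for p in points]
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(feats[i], feats[j]) < threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p[0])
    return sorted(groups.values())


coarse = group_at_scale(POINTS, scale=2.0)  # whole objects
fine = group_at_scale(POINTS, scale=0.1)    # subparts
print(coarse)
print(fine)
```

At the coarse scale the wheels are grouped with the cab as one object, while at the fine scale they form their own group. Querying several scales in decreasing order yields a coarse-to-fine hierarchy of the kind the abstract describes.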

Code Implementations: 1 repo
