CV | Dec 10, 2025

ASSIST-3D: Adapted Scene Synthesis for Class-Agnostic 3D Instance Segmentation

arXiv:2512.09364v1 | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses data scarcity for researchers and practitioners in 3D computer vision, enabling better generalization when segmenting unseen objects. It is an incremental advance, since it builds on existing synthetic data generation approaches.

The paper tackles the problem of class-agnostic 3D instance segmentation by addressing data scarcity through a synthetic data generation pipeline, resulting in models that significantly outperform existing methods on benchmarks like ScanNetV2, ScanNet++, and S3DIS.

Class-agnostic 3D instance segmentation tackles the challenging task of segmenting all object instances, including previously unseen ones, without relying on semantic classes. Current methods struggle to generalize because annotated 3D scene data is scarce and 2D segmentations are noisy. While synthetic data generation offers a promising solution, existing 3D scene synthesis methods fail to simultaneously satisfy geometry diversity, context complexity, and layout reasonability, each of which is essential for this task. To address these needs, we propose an Adapted 3D Scene Synthesis pipeline for class-agnostic 3D Instance SegmenTation, termed ASSIST-3D, which synthesizes data suited to improving model generalization. Specifically, ASSIST-3D features three key innovations: 1) Heterogeneous Object Selection from extensive 3D CAD asset collections, incorporating randomness in object sampling to maximize geometric and contextual diversity; 2) Scene Layout Generation through LLM-guided spatial reasoning combined with depth-first search for reasonable object placements; and 3) Realistic Point Cloud Construction via multi-view RGB-D image rendering and fusion from the synthetic scenes, closely mimicking real-world sensor data acquisition. Experiments on the ScanNetV2, ScanNet++, and S3DIS benchmarks demonstrate that models trained with ASSIST-3D-generated data significantly outperform existing methods. Further comparisons underscore the superiority of our purpose-built pipeline over existing 3D scene synthesis approaches.
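
The abstract mentions depth-first search for reasonable object placements but gives no details. As a rough illustration only, the sketch below shows how a depth-first (backtracking) search can place objects one by one and undo a choice when a later object no longer fits. Everything in it is an assumption made for illustration: the names `place_objects` and `overlaps`, the axis-aligned 2D footprints, the fixed grid `step`, and the reduction of all spatial constraints to "stay inside the room and do not overlap". The LLM-guided constraints of the actual pipeline are not modeled.

```python
from typing import List, Optional, Tuple

# Hypothetical footprint type: (x, y, width, depth) on the floor plane.
Rect = Tuple[float, float, float, float]


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if two axis-aligned footprints share interior area."""
    ax, ay, aw, ad = a
    bx, by, bw, bd = b
    return ax < bx + bw and bx < ax + aw and ay < by + bd and by < ay + ad


def place_objects(
    sizes: List[Tuple[float, float]],
    room_w: float,
    room_d: float,
    step: float = 0.5,
    placed: Optional[List[Rect]] = None,
) -> Optional[List[Rect]]:
    """Depth-first search over grid positions; returns a layout or None."""
    placed = [] if placed is None else placed
    if len(placed) == len(sizes):
        return list(placed)  # every object has a position

    w, d = sizes[len(placed)]
    if w > room_w or d > room_d:
        return None  # this object cannot fit in the room at all

    # Number of grid steps that keep the footprint inside the room.
    nx = int((room_w - w) / step + 1e-9)
    ny = int((room_d - d) / step + 1e-9)
    for i in range(nx + 1):
        for j in range(ny + 1):
            cand = (i * step, j * step, w, d)
            if any(overlaps(cand, p) for p in placed):
                continue
            placed.append(cand)  # go one level deeper
            result = place_objects(sizes, room_w, room_d, step, placed)
            if result is not None:
                return result
            placed.pop()  # backtrack and try the next position
    return None
```

Calling `place_objects([(2.0, 2.0), (2.0, 2.0)], 4.0, 2.0)` returns a layout with two non-overlapping footprints, while asking for three such objects in the same room returns `None`. A real pipeline would presumably also handle rotations, 3D extents, and relational constraints such as "chair faces desk"; those are left out here to keep the search itself visible.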

