CV | May 21, 2025

DC-Scene: Data-Centric Learning for 3D Scene Understanding

arXiv:2505.15232v1 | 9 citations | h-index: 6 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses training efficiency and data scarcity in 3D scene understanding for applications such as robotics and autonomous driving. Its contribution is an incremental improvement: a new data filtering method.

The paper tackles high computational cost and scarce annotated data in 3D scene understanding by proposing DC-Scene, a data-centric framework that pairs a quality filter with a curriculum scheduler. It reports state-of-the-art performance (86.1 CIDEr with the top-75% subset, versus 85.4 with the full dataset) while reducing training cost by about two-thirds.

3D scene understanding plays a fundamental role in vision applications such as robotics, autonomous driving, and augmented reality. However, advancing learning-based 3D scene understanding remains challenging due to two key limitations: (1) the large scale and complexity of 3D scenes lead to higher computational costs and slower training compared to 2D counterparts; and (2) high-quality annotated 3D datasets are significantly scarcer than those available for 2D vision. These challenges underscore the need for more efficient learning paradigms. In this work, we propose DC-Scene, a data-centric framework tailored for 3D scene understanding, which emphasizes enhancing data quality and training efficiency. Specifically, we introduce a CLIP-driven dual-indicator quality (DIQ) filter, combining vision-language alignment scores with caption-loss perplexity, along with a curriculum scheduler that progressively expands the training pool from the top 25% to 75% of scene-caption pairs. This strategy filters out noisy samples and significantly reduces dependence on large-scale labeled 3D data. Extensive experiments on ScanRefer and Nr3D demonstrate that DC-Scene achieves state-of-the-art performance (86.1 CIDEr with the top-75% subset vs. 85.4 with the full dataset) while reducing training cost by approximately two-thirds, confirming that a compact set of high-quality samples can outperform exhaustive training. Code will be available at https://github.com/AIGeeksGroup/DC-Scene.
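The filter-then-schedule idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two indicators are combined or how the pool grows between 25% and 75%, so the rank-averaging rule, the linear ramp, and every function name below (`diq_scores`, `curriculum_fraction`, `select_pool`) are assumptions made for illustration. The inputs are assumed to be precomputed per-sample CLIP alignment scores and mean per-token caption losses in nats.

```python
import math


def diq_scores(clip_scores, caption_losses):
    """Combine two quality indicators into one score per sample.

    Assumption (not from the paper): each indicator is rank-normalised
    to [0, 1] and the two ranks are averaged. Higher CLIP alignment is
    treated as better; lower caption perplexity is treated as better.
    """
    n = len(clip_scores)
    if n != len(caption_losses):
        raise ValueError("both indicators need one value per sample")

    # Perplexity is the exponential of the mean per-token loss.
    perplexities = [math.exp(loss) for loss in caption_losses]

    def rank_normalise(values, higher_is_better):
        # Order indices from worst to best, then map position to [0, 1].
        order = sorted(range(n), key=lambda i: values[i],
                       reverse=not higher_is_better)
        ranks = [0.0] * n
        for position, index in enumerate(order):
            ranks[index] = position / (n - 1) if n > 1 else 1.0
        return ranks

    align_rank = rank_normalise(clip_scores, higher_is_better=True)
    ppl_rank = rank_normalise(perplexities, higher_is_better=False)
    return [(a + p) / 2 for a, p in zip(align_rank, ppl_rank)]


def curriculum_fraction(epoch, total_epochs, start=0.25, end=0.75):
    """Fraction of the ranked data to train on at a given epoch.

    Assumption: a linear ramp from `start` to `end`; the abstract only
    states the two endpoints (top 25% and top 75%).
    """
    if total_epochs <= 1:
        return end
    clamped = min(max(epoch, 0), total_epochs - 1)
    t = clamped / (total_epochs - 1)
    return start + t * (end - start)


def select_pool(scores, fraction):
    """Return indices of the top `fraction` of samples by score."""
    k = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i],
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    clip = [0.9, 0.1, 0.5, 0.7]      # toy alignment scores
    losses = [0.2, 2.0, 1.0, 0.5]    # toy mean caption losses
    scores = diq_scores(clip, losses)
    for epoch in range(5):
        frac = curriculum_fraction(epoch, total_epochs=5)
        print(epoch, frac, select_pool(scores, frac))
```

In a real pipeline the selected indices would feed a data loader that is rebuilt whenever the fraction changes; a weighted sum of standardised scores would be an equally plausible way to combine the two indicators.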

Code Implementations: 1 repo
