CV | Mar 19, 2025

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

arXiv:2503.15019v1 | 8 citations | h-index: 27 | CVPR
Originality: Incremental advance
AI Analysis

This work addresses data scarcity issues in 4D scene understanding for computer vision applications, representing an incremental advancement by transferring knowledge from 2D to 4D domains.

The paper tackles data scarcity and suboptimal performance in 4D Panoptic Scene Graph generation by proposing a framework that leverages rich 2D visual scene annotations; in experiments, it improves over baseline models by a large margin.

The recently introduced 4D Panoptic Scene Graph (4D-PSG) provides the most advanced representation to date for comprehensively modeling the dynamic 4D visual world. Unfortunately, current pioneering 4D-PSG research suffers from severe data scarcity and the resulting out-of-vocabulary problems; in addition, the pipeline nature of the benchmark generation method can lead to suboptimal performance. To address these challenges, this paper investigates a novel framework for 4D-PSG generation that leverages rich 2D visual scene annotations to enhance 4D scene learning. First, we introduce a 4D Large Language Model (4D-LLM) integrated with a 3D mask decoder for end-to-end generation of 4D-PSG. A chained scene graph (SG) inference mechanism is further designed to exploit LLMs' open-vocabulary capabilities to iteratively infer accurate and comprehensive object and relation labels. Most importantly, we propose a 2D-to-4D visual scene transfer learning framework, in which a spatial-temporal scene transcending strategy transfers dimension-invariant features from abundant 2D SG annotations to 4D scenes, compensating for data scarcity in 4D-PSG. Extensive experiments on the benchmark data show that our method outperforms baseline models by a large margin, demonstrating its effectiveness.

