GR · CV · MM · Apr 17, 2025

HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation

arXiv:2504.13072v1 · 14 citations · h-index: 11 · MM
Originality Highly original
AI Analysis

This paper addresses scene-level 3D generation for multimedia and computer graphics applications, representing an incremental advance through a novel hierarchical approach.

The paper targets two limitations of existing 3D scene generation: a narrow range of object categories and little editing flexibility. It introduces HiScene, a hierarchical framework that creates high-fidelity scenes with compositional identities and aesthetic content. According to the authors, the method produces more natural object arrangements and complete object instances suitable for interactive applications.

Scene-level 3D generation represents a critical frontier in multimedia and computer graphics, yet existing approaches either suffer from limited object categories or lack editing flexibility for interactive applications. In this paper, we present HiScene, a novel hierarchical framework that bridges the gap between 2D image generation and 3D object generation and delivers high-fidelity scenes with compositional identities and aesthetic scene content. Our key insight is treating scenes as hierarchical "objects" under isometric views, where a room functions as a complex object that can be further decomposed into manipulatable items. This hierarchical approach enables us to generate 3D content that aligns with 2D representations while maintaining compositional structure. To ensure completeness and spatial alignment of each decomposed instance, we develop a video-diffusion-based amodal completion technique that effectively handles occlusions and shadows between objects, and introduce shape prior injection to ensure spatial coherence within the scene. Experimental results demonstrate that our method produces more natural object arrangements and complete object instances suitable for interactive applications, while maintaining physical plausibility and alignment with user inputs.
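To make the "scene as a hierarchical object" idea concrete, the following is a minimal sketch of a scene tree in which a room is the top-level object and its items are child nodes that can be moved independently. It is an illustration only, not the authors' implementation: the `SceneNode` class, its fields, and `build_example` are hypothetical names, and the positions are made-up values.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SceneNode:
    """Hypothetical node: a scene-level 'object' or one of its decomposed items."""
    name: str
    position: Vec3 = (0.0, 0.0, 0.0)  # offset relative to the parent node
    children: List["SceneNode"] = field(default_factory=list)

    def add(self, child: "SceneNode") -> "SceneNode":
        """Attach a child item and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self, origin: Vec3 = (0.0, 0.0, 0.0)) -> Iterator[Tuple[str, Vec3]]:
        """Yield (name, world position) for this node and all descendants, depth-first."""
        world = tuple(o + p for o, p in zip(origin, self.position))
        yield self.name, world
        for child in self.children:
            yield from child.walk(world)


def build_example() -> SceneNode:
    """Build a toy room with illustrative (assumed) item placements."""
    room = SceneNode("room")
    desk = room.add(SceneNode("desk", (1.0, 0.0, 2.0)))
    desk.add(SceneNode("lamp", (0.25, 0.75, 0.0)))  # the lamp sits on the desk
    room.add(SceneNode("chair", (1.0, 0.0, 1.0)))
    return room


if __name__ == "__main__":
    for name, pos in build_example().walk():
        print(name, pos)
```

Because each item stores an offset relative to its parent, moving the desk also moves the lamp, which is one simple way a compositional structure can support interactive editing.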

