GR · CV · Aug 26, 2025

SemLayoutDiff: Semantic Layout Generation with Diffusion Model for Indoor Scene Synthesis

arXiv:2508.18597v2 · 4 citations · h-index: 46
Originality Highly original
AI Analysis

This work addresses the need for automated indoor scene synthesis in architecture and design applications. It is an incremental improvement over prior work, adding conditioning on architectural constraints.

The paper tackles the problem of generating diverse 3D indoor scenes by introducing SemLayoutDiff, a model that synthesizes semantic layouts and furniture placements conditioned on room masks. The resulting scenes are spatially coherent and realistic, and the method outperforms prior methods on the 3D-FRONT dataset.

We present SemLayoutDiff, a unified model for synthesizing diverse 3D indoor scenes across multiple room types. The model introduces a scene layout representation combining a top-down semantic map and attributes for each object. Unlike prior approaches, which cannot condition on architectural constraints, SemLayoutDiff employs a categorical diffusion model capable of conditioning scene synthesis explicitly on room masks. It first generates a coherent semantic map, followed by a cross-attention-based network to predict furniture placements that respect the synthesized layout. Our method also accounts for architectural elements such as doors and windows, ensuring that generated furniture arrangements remain practical and unobstructed. Experiments on the 3D-FRONT dataset show that SemLayoutDiff produces spatially coherent, realistic, and varied scenes, outperforming previous methods.
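The layout representation and the mask conditioning described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the label set, the `ObjectAttributes` fields, and the `noise_semantic_map` helper are all hypothetical, and the corruption rule shown is the generic uniform-transition forward step used by categorical (multinomial) diffusion models, assumed here only for illustration. It shows a top-down semantic map paired with per-object attributes, and how a room mask might keep cells outside the room fixed while in-room cells are noised.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical label set; the paper's actual categories are not given in this summary.
VOID, FLOOR, BED, WARDROBE, DOOR, WINDOW = range(6)
NUM_CLASSES = 6


@dataclass
class ObjectAttributes:
    """Per-object attributes (field choice is an assumption, not from the paper)."""
    category: int
    size: Tuple[float, float, float]  # width, height, depth in metres
    orientation: float                # rotation about the vertical axis, radians


@dataclass
class SceneLayout:
    """Top-down semantic map plus attributes for each object."""
    semantic_map: List[List[int]]     # one class label per grid cell
    objects: List[ObjectAttributes]


def noise_semantic_map(semantic_map, room_mask, alpha_bar, rng):
    """Generic uniform-transition forward step of a categorical diffusion.

    Each in-room cell keeps its label with probability alpha_bar and is
    otherwise resampled uniformly over all classes. Cells outside the room
    mask are held at VOID, which is one simple way to condition on the mask.
    """
    noisy = []
    for row, mask_row in zip(semantic_map, room_mask):
        new_row = []
        for label, inside in zip(row, mask_row):
            if not inside:
                new_row.append(VOID)
            elif rng.random() < alpha_bar:
                new_row.append(label)
            else:
                new_row.append(rng.randrange(NUM_CLASSES))
        noisy.append(new_row)
    return noisy


if __name__ == "__main__":
    room_mask = [[True, True, False],
                 [True, True, False]]
    layout = SceneLayout(
        semantic_map=[[FLOOR, BED, VOID],
                      [FLOOR, FLOOR, VOID]],
        objects=[ObjectAttributes(BED, (1.6, 0.5, 2.0), 0.0)],
    )
    rng = random.Random(0)
    print(noise_semantic_map(layout.semantic_map, room_mask, 0.5, rng))
```

In an actual model, a learned network would reverse this corruption step by step to produce the semantic map, and a second network would then predict the object attributes from it. Both are omitted here because the summary does not specify their architecture beyond the use of cross-attention.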

