CV · Oct 7, 2021

ATISS: Autoregressive Transformers for Indoor Scene Synthesis

arXiv:2110.03675v1 · 255 citations
Originality: Highly original
AI Analysis

This work addresses the need for efficient and flexible scene synthesis tools for interactive 3D design and data generation. Its novel formulation makes the same trained model useful beyond fully automatic layout synthesis, for example for scene completion and object suggestion.

The paper tackles the problem of synthesizing realistic and diverse indoor furniture layouts, either automatically or from partial inputs. It presents ATISS, an autoregressive transformer architecture that generates rooms as unordered sets of objects, producing more plausible layouts than existing methods while using fewer parameters and running up to 8 times faster.

The ability to synthesize realistic and diverse indoor furniture layouts automatically or based on partial input unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation. In this paper, we present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments, given only the room type and its floor plan. In contrast to prior work, which poses scene synthesis as sequence generation, our model generates rooms as unordered sets of objects. We argue that this formulation is more natural, as it makes ATISS generally useful beyond fully automatic room layout synthesis. For example, the same trained model can be used in interactive applications for general scene completion, partial room re-arrangement with any objects specified by the user, as well as object suggestions for any partial room. To enable this, our model leverages the permutation equivariance of the transformer when conditioning on the partial scene, and is trained to be permutation-invariant across object orderings. Our model is trained end-to-end as an autoregressive generative model using only labeled 3D bounding boxes as supervision. Evaluations on four room types in the 3D-FRONT dataset demonstrate that our model consistently generates plausible room layouts that are more realistic than existing methods. In addition, it has fewer parameters, is simpler to implement and train, and runs up to 8 times faster than existing methods.
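As a rough illustration of what training "across object orderings" could look like, the sketch below builds one training example from a set of labeled 3D bounding boxes: it shuffles the objects, keeps a random-length prefix as the conditioning set, and uses the next object as the prediction target. This is a minimal sketch under assumptions, not the authors' implementation: the `Box` fields, the `make_training_example` helper, and the use of `None` as an end-of-scene target are all hypothetical names and choices made for this example.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical labeled 3D bounding box (field names are assumptions)."""
    label: str
    center: Tuple[float, float, float]
    size: Tuple[float, float, float]
    yaw: float


def make_training_example(
    scene: List[Box], rng: random.Random
) -> Tuple[List[Box], Optional[Box]]:
    """Shuffle the scene, then split it into (context set, next-object target).

    A target of None stands for an assumed "end of scene" signal, used when
    every object already sits in the context.
    """
    shuffled = list(scene)             # copy so the caller's list is untouched
    rng.shuffle(shuffled)              # a fresh random ordering on every call
    k = rng.randint(0, len(shuffled))  # context size; randint is inclusive
    context = shuffled[:k]
    target = shuffled[k] if k < len(shuffled) else None
    return context, target


# Toy bedroom with three objects (made-up values, for illustration only).
scene = [
    Box("bed", (1.0, 0.0, 2.0), (2.0, 0.5, 1.5), 0.0),
    Box("nightstand", (2.5, 0.0, 2.0), (0.5, 0.5, 0.5), 0.0),
    Box("wardrobe", (0.5, 0.0, 0.5), (1.0, 2.0, 0.5), 1.5),
]
context, target = make_training_example(scene, random.Random(0))
```

Because the ordering is resampled on every call, a model trained on such examples never sees a canonical object order. The same conditioning-on-a-set interface is what would let a user supply any partial room and ask for a completion or a suggested next object.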

Code Implementations: 1 repo
