CV · Jul 2, 2020

RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces

arXiv:2007.01272v2 · 59 citations
AI Analysis

This work addresses realistic multi-object scene synthesis for applications in computer vision and graphics. It is an incremental improvement that introduces novel method elements.

The authors tackle the problem of generating physically plausible scenes and videos of multiple interacting objects. They introduce RELATE, a model that combines an object-centric GAN with explicit modeling of correlations between objects, and report that it significantly outperforms prior art on both synthetic and real-world datasets.

We present RELATE, a model that learns to generate physically plausible scenes and videos of multiple interacting objects. Similar to other generative approaches, RELATE is trained end-to-end on raw, unlabeled data. RELATE combines an object-centric GAN formulation with a model that explicitly accounts for correlations between individual objects. This allows the model to generate realistic scenes and videos from a physically-interpretable parameterization. Furthermore, we show that modeling the object correlation is necessary to learn to disentangle object positions and identity. We find that RELATE is also amenable to physically realistic scene editing and that it significantly outperforms prior art in object-centric scene generation in both synthetic (CLEVR, ShapeStacks) and real-world data (cars). In addition, in contrast to state-of-the-art methods in object-centric generative modeling, RELATE also extends naturally to dynamic scenes and generates videos of high visual fidelity. Source code, datasets and more results are available at http://geometry.cs.ucl.ac.uk/projects/2020/relate/.
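The idea of explicitly accounting for correlations between objects can be illustrated with a minimal sketch. This is not the RELATE implementation: RELATE is trained end-to-end on data, whereas the sketch below swaps in a hand-written repulsion rule purely for illustration. The function names, the `min_dist` and `max_steps` parameters, and the 2D unit-square layout are all assumptions. Positions are first sampled independently, then adjusted jointly so that each object's final position depends on the others.

```python
import math
import random


def sample_independent_positions(num_objects, rng):
    """Draw each object's (x, y) position independently in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(num_objects)]


def correct_positions(positions, min_dist=0.2, max_steps=200):
    """Hypothetical stand-in for a learned correlation module.

    Repeatedly pushes apart any pair of objects closer than min_dist,
    so each final position depends on the other objects in the scene.
    """
    pts = [list(p) for p in positions]
    for _ in range(max_steps):
        moved = False
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                dx = pts[j][0] - pts[i][0]
                dy = pts[j][1] - pts[i][1]
                dist = math.hypot(dx, dy)
                if dist >= min_dist:
                    continue
                if dist < 1e-12:
                    # Coincident objects: pick an arbitrary push direction.
                    ux, uy = 1.0, 0.0
                else:
                    ux, uy = dx / dist, dy / dist
                # Move both objects by half of the missing separation.
                shift = (min_dist - dist) / 2.0
                pts[i][0] -= ux * shift
                pts[i][1] -= uy * shift
                pts[j][0] += ux * shift
                pts[j][1] += uy * shift
                moved = True
        if not moved:
            break
    return [tuple(p) for p in pts]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
independent = sample_independent_positions(4, rng)
corrected = correct_positions(independent)
```

In an object-centric generator, positions like `corrected` would parameterize where each object is rendered. Without a joint correction step, independently sampled objects can overlap in ways that are not physically plausible.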

Code Implementations · 1 repo
