CV · LG · Jul 24, 2020

Unsupervised Discovery of 3D Physical Objects from Video

arXiv:2007.12348v3 · 41 citations
AI Analysis

This paper addresses unsupervised 3D object segmentation for computer vision and robotics. It proposes a novel method for a known bottleneck rather than an incremental improvement.

The paper tackles unsupervised discovery of 3D physical objects from video by leveraging physics and object interactions, resulting in a model that reliably segments objects in synthetic and real scenes and infers object properties for reasoning about physical events.

We study the problem of unsupervised physical object discovery. While existing frameworks aim to decompose scenes into 2D segments based off each object's appearance, we explore how physics, especially object interactions, facilitates disentangling of 3D geometry and position of objects from video, in an unsupervised manner. Drawing inspiration from developmental psychology, our Physical Object Discovery Network (POD-Net) uses both multi-scale pixel cues and physical motion cues to accurately segment observable and partially occluded objects of varying sizes, and infer properties of those objects. Our model reliably segments objects on both synthetic and real scenes. The discovered object properties can also be used to reason about physical events.

