CVROJul 23, 2019

Not Only Look But Observe: Variational Observation Model of Scene-Level 3D Multi-Object Understanding for Probabilistic SLAM

arXiv:1907.09760v32 citations
Originality Incremental advance
AI Analysis

This work addresses scene-level 3D understanding for robotics and autonomous systems, offering a novel approach but with incremental improvements over existing probabilistic methods.

The paper tackles the problem of 3D multi-object understanding from a single 2D image in probabilistic SLAM, where previous methods focused on single objects and ignored scene-level relations. It proposes NOLBO, a variational observation model that estimates latent variables for full shape and pose, enabling object-oriented data association and integration into probabilistic inference.

We present NOLBO, a variational observation model estimation for 3D multi-object from 2D single shot. Previous probabilistic instance-level understandings mainly consider the single-object image, not single shot with multi-object; relations between objects and the entire scene are out of their focus. The objectness of each observation also hardly join their model. Therefore, we propose a method to approximate the Bayesian observation model of scene-level 3D multi-object understanding. By exploiting variational auto-encoder (VAE), we estimate latent variables from the entire scene, which follow tractable distributions and concurrently imply 3D full shape and pose. To perform object-oriented data association and probabilistic simultaneous localization and mapping (SLAM), our observation models can easily be adopted to probabilistic inference by replacing object-oriented features with latent variables.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes