RO, AI, CV
Feb 15, 2020

3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans

arXiv:2002.06289v2
262 citations
AI Analysis

This work targets planning and decision-making in robotics, particularly human-robot interaction and long-term autonomy, by contributing a new representation and an engine that builds it. The contribution is partly incremental, since the engine combines existing detection and tracking techniques.

The paper tackles actionable spatial perception for robotics. It introduces 3D Dynamic Scene Graphs (DSGs), a unified representation that integrates entities such as objects, humans, and places with their spatio-temporal relations. It also presents what the authors describe as the first fully automatic Spatial Perception Engine (SPIN), which builds DSGs from visual-inertial data, and it assesses the engine's robustness in a photo-realistic simulator.

We present a unified representation for actionable spatial perception: 3D Dynamic Scene Graphs. Scene graphs are directed graphs where nodes represent entities in the scene (e.g. objects, walls, rooms), and edges represent relations (e.g. inclusion, adjacency) among nodes. Dynamic scene graphs (DSGs) extend this notion to represent dynamic scenes with moving agents (e.g. humans, robots), and to include actionable information that supports planning and decision-making (e.g. spatio-temporal relations, topology at different levels of abstraction). Our second contribution is to provide the first fully automatic Spatial PerceptIon eNgine (SPIN) to build a DSG from visual-inertial data. We integrate state-of-the-art techniques for object and human detection and pose estimation, and we describe how to robustly infer object, robot, and human nodes in crowded scenes. To the best of our knowledge, this is the first paper that reconciles visual-inertial SLAM and dense human mesh tracking. Moreover, we provide algorithms to obtain hierarchical representations of indoor environments (e.g. places, structures, rooms) and their relations. Our third contribution is to demonstrate the proposed spatial perception engine in a photo-realistic Unity-based simulator, where we assess its robustness and expressiveness. Finally, we discuss the implications of our proposal on modern robotics applications. 3D Dynamic Scene Graphs can have a profound impact on planning and decision-making, human-robot interaction, long-term autonomy, and scene prediction. A video abstract is available at https://youtu.be/SWbofjhyPzI
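
To make the abstract's definition concrete, here is a minimal Python sketch of a scene graph as a directed graph with typed nodes and relation-labelled edges, where each node carries a timestamped track so that moving agents can be represented. It is not the paper's SPIN implementation. The class names, the layer strings, the relation names (`contains`, `adjacent_to`), and the reduction of a pose to a 3D position are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumption: a pose is simplified to a 3D position (x, y, z).
Position = Tuple[float, float, float]


@dataclass
class Node:
    node_id: str
    layer: str   # illustrative layer name, e.g. "room", "object", "agent"
    label: str   # semantic label, e.g. "kitchen", "chair", "human"
    # Timestamp -> position. Static entities hold one entry, moving agents many.
    track: Dict[float, Position] = field(default_factory=dict)


@dataclass
class SceneGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    # Directed edges stored as (source, relation, target) triples.
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, source: str, relation: str, target: str) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((source, relation, target))

    def targets(self, source: str, relation: str) -> List[str]:
        """Nodes reachable from `source` over edges labelled `relation`."""
        return [t for s, r, t in self.edges if s == source and r == relation]

    def position_at(self, node_id: str, time: float) -> Position:
        """Latest recorded position at or before `time`."""
        track = self.nodes[node_id].track
        earlier = [ts for ts in track if ts <= time]
        if not earlier:
            raise ValueError(f"no observation of {node_id} at or before {time}")
        return track[max(earlier)]


def build_example() -> SceneGraph:
    """Hypothetical two-room scene with one object and one moving human."""
    g = SceneGraph()
    g.add_node(Node("room_1", "room", "kitchen", {0.0: (2.0, 3.0, 0.0)}))
    g.add_node(Node("room_2", "room", "corridor", {0.0: (6.0, 3.0, 0.0)}))
    g.add_node(Node("obj_1", "object", "chair", {0.0: (1.5, 2.5, 0.0)}))
    g.add_node(Node("human_1", "agent", "human",
                    {0.0: (2.0, 2.0, 0.0), 1.0: (2.5, 2.0, 0.0)}))
    g.add_edge("room_1", "contains", "obj_1")       # inclusion
    g.add_edge("room_1", "contains", "human_1")     # inclusion
    g.add_edge("room_1", "adjacent_to", "room_2")   # adjacency
    return g
```

A planner could then ask which entities a room contains with `build_example().targets("room_1", "contains")`, or where the human was at a given time with `position_at("human_1", 0.5)`. Storing edges as plain triples keeps the sketch short; a real system would likely index them by source and relation.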

Code Implementations (3 repos)
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
