CV · Jul 7, 2020

Long-term Human Motion Prediction with Scene Context

arXiv:2007.03672v3 · 287 citations
AI Analysis

The paper addresses inaccurate long-term human motion prediction, a problem relevant to robotics and human-computer interaction, by integrating scene context. The contribution is incremental in that it builds on existing motion prediction methods.

The paper tackles long-term human motion prediction by incorporating scene context. It proposes a three-stage framework that samples goals, plans paths, and predicts poses, and it reports consistent improvements over existing methods on both synthetic and real datasets.

Human movement is goal-directed and influenced by the spatial layout of the objects in the scene. To plan future human motion, it is crucial to perceive the environment -- imagine how hard it is to navigate a new room with lights off. Existing works on predicting human motion do not pay attention to the scene context and thus struggle in long-term prediction. In this work, we propose a novel three-stage framework that exploits scene context to tackle this task. Given a single scene image and 2D pose histories, our method first samples multiple human motion goals, then plans 3D human paths towards each goal, and finally predicts 3D human pose sequences following each path. For stable training and rigorous evaluation, we contribute a diverse synthetic dataset with clean annotations. In both synthetic and real datasets, our method shows consistent quantitative and qualitative improvements over existing methods.
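The three stages named in the abstract (sample goals, plan a path to each goal, predict poses along each path) can be illustrated with a minimal Python sketch. This is not the authors' method: the paper's stages take a scene image and 2D pose histories as input, whereas the stand-ins below ignore the scene entirely. Random goal sampling, straight-line path interpolation, and rigid translation of the last pose are all assumptions chosen only to show how the stages hand data to one another. The names `sample_goals`, `plan_path`, `predict_poses`, and `predict_motion` are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float, float]


def sample_goals(last_position: Point, num_goals: int, radius: float,
                 rng: random.Random) -> List[Point]:
    """Stage 1 stand-in: draw candidate goals on the ground plane near the person.

    Assumption: uniform random sampling; no scene image is used here.
    """
    goals = []
    for _ in range(num_goals):
        dx = rng.uniform(-radius, radius)
        dz = rng.uniform(-radius, radius)
        goals.append((last_position[0] + dx, last_position[1], last_position[2] + dz))
    return goals


def plan_path(start: Point, goal: Point, num_steps: int) -> List[Point]:
    """Stage 2 stand-in: straight-line interpolation from start to goal."""
    return [
        tuple(s + (g - s) * t / num_steps for s, g in zip(start, goal))
        for t in range(1, num_steps + 1)
    ]


def predict_poses(path: List[Point], joint_offsets: List[Point]) -> List[List[Point]]:
    """Stage 3 stand-in: move the last observed root-relative pose along the path."""
    return [
        [(p[0] + o[0], p[1] + o[1], p[2] + o[2]) for o in joint_offsets]
        for p in path
    ]


def predict_motion(last_position: Point, joint_offsets: List[Point],
                   num_goals: int = 3, num_steps: int = 5,
                   radius: float = 2.0, seed: int = 0):
    """Chain the three stages; returns one (goal, path, poses) tuple per sampled goal."""
    rng = random.Random(seed)
    results = []
    for goal in sample_goals(last_position, num_goals, radius, rng):
        path = plan_path(last_position, goal, num_steps)
        results.append((goal, path, predict_poses(path, joint_offsets)))
    return results


if __name__ == "__main__":
    # Two-joint toy skeleton: root at the origin and a head one unit above it.
    futures = predict_motion((0.0, 0.0, 0.0), [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
    for goal, path, poses in futures:
        print(goal, len(path), len(poses))
```

Each sampled goal yields its own path and pose sequence, which mirrors the multi-hypothesis structure the abstract describes. In a learned system, each placeholder function would be replaced by a model conditioned on the scene.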

Code Implementations: 1 repo