LG, CV, ML | Dec 18, 2019

Contextually Plausible and Diverse 3D Human Motion Prediction

arXiv:1912.08521v4, 19 citations
Originality: Incremental advance
AI Analysis

This work improves motion prediction for applications such as animation and robotics, but the advance is incremental because it builds on existing CVAE approaches.

The paper tackles the problem of generating diverse and plausible 3D human motion predictions from observed poses, addressing issues in existing CVAE-based methods that often produce unrealistic or insufficiently varied outputs. It introduces a new variational framework that conditions latent sampling on past observations, resulting in higher-quality motions that retain diversity and preserve contextual information.

We tackle the task of diverse 3D human motion prediction, that is, forecasting multiple plausible future 3D poses given a sequence of observed 3D poses. In this context, a popular approach consists of using a Conditional Variational Autoencoder (CVAE). However, existing approaches that do so either fail to capture the diversity in human motion, or generate diverse but semantically implausible continuations of the observed motion. In this paper, we address both of these problems by developing a new variational framework that accounts for both diversity and context of the generated future motion. To this end, and in contrast to existing approaches, we condition the sampling of the latent variable that acts as source of diversity on the representation of the past observation, thus encouraging it to carry relevant information. Our experiments demonstrate that our approach yields motions not only of higher quality while retaining diversity, but also that preserve the contextual information contained in the observed 3D pose sequence.
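The central idea in the abstract, drawing the latent variable from a distribution that depends on the representation of the past observation, can be illustrated with a minimal NumPy sketch. This is not the paper's architecture: the embedding `h_past`, the linear maps `W_mu` and `W_logvar`, and the function name are all hypothetical stand-ins for learned networks, chosen only to show how the sampling distribution can change with the observed motion.

```python
import numpy as np

def sample_conditioned_latent(h_past, W_mu, W_logvar, num_samples, rng):
    """Draw latent codes whose distribution depends on the past-motion embedding.

    h_past:   (d_h,) embedding of the observed pose sequence (hypothetical encoder output)
    W_mu, W_logvar: (d_z, d_h) illustrative linear maps standing in for learned networks
    """
    mu = W_mu @ h_past                        # latent mean, a function of the past
    std = np.exp(0.5 * (W_logvar @ h_past))   # latent std, a function of the past
    eps = rng.standard_normal((num_samples, mu.shape[0]))
    return mu + std * eps                     # reparameterised samples, (num_samples, d_z)

rng = np.random.default_rng(0)
d_h, d_z = 8, 4
W_mu = rng.standard_normal((d_z, d_h))
W_logvar = 0.1 * rng.standard_normal((d_z, d_h))

# Two different observed motions (random placeholders, not real pose data).
h_a = rng.standard_normal(d_h)
h_b = rng.standard_normal(d_h)

# Many samples per observation supply diversity; their distribution is tied to the observation.
z_a = sample_conditioned_latent(h_a, W_mu, W_logvar, 1000, rng)
z_b = sample_conditioned_latent(h_b, W_mu, W_logvar, 1000, rng)
```

By contrast, a sampler that always draws from a fixed standard normal gives the decoder no reason to tie the latent code to the observed motion, which is one way to read the abstract's point about diverse but implausible continuations.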
