CVOct 11, 2020

SDMTL: Semi-Decoupled Multi-grained Trajectory Learning for 3D human motion prediction

arXiv:2010.05133v15 citations
Originality Incremental advance
AI Analysis

This work addresses the need for accurate human motion prediction to enable intelligent robots to interact with humans, representing an incremental improvement over existing methods.

The paper tackles the problem of 3D human motion prediction by proposing SDMTL, a network that flexibly captures multi-grained trajectory information, achieving state-of-the-art performance on benchmarks like Human3.6M and CMU-Mocap.

Predicting future human motion is critical for intelligent robots to interact with humans in the real world, and human motion has the nature of multi-granularity. However, most of the existing work either implicitly modeled multi-granularity information via fixed modes or focused on modeling a single granularity, making it hard to well capture this nature for accurate predictions. In contrast, we propose a novel end-to-end network, Semi-Decoupled Multi-grained Trajectory Learning network (SDMTL), to predict future poses, which not only flexibly captures rich multi-grained trajectory information but also aggregates multi-granularity information for predictions. Specifically, we first introduce a Brain-inspired Semi-decoupled Motion-sensitive Encoding module (BSME), effectively capturing spatiotemporal features in a semi-decoupled manner. Then, we capture the temporal dynamics of motion trajectory at multi-granularity, including fine granularity and coarse granularity. We learn multi-grained trajectory information using BSMEs hierarchically and further capture the information of temporal evolutional directions at each granularity by gathering the outputs of BSMEs at each granularity and applying temporal convolutions along the motion trajectory. Next, the captured motion dynamics can be further enhanced by aggregating the information of multi-granularity with a weighted summation scheme. Finally, experimental results on two benchmarks, including Human3.6M and CMU-Mocap, show that our method achieves state-of-the-art performance, demonstrating the effectiveness of our proposed method. The code will be available if the paper is accepted.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes