CV · GR · MM · Oct 21, 2021

MUGL: Large Scale Multi Person Conditional Action Generation with Locomotion

arXiv:2110.11460v1 · 20 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for practical and controllable large-scale human action generation, though it appears incremental as it builds on existing methods with hybrid representations.

The paper tackles the problem of generating diverse, controllable multi-person action sequences with locomotion. Its model, MUGL, is reported to produce better-quality generations than the baselines despite being smaller and simpler.

We introduce MUGL, a novel deep neural model for large-scale, diverse generation of single and multi-person pose-based action sequences with locomotion. Our controllable approach enables variable-length generations customizable by action category, across more than 100 categories. To enable intra/inter-category diversity, we model the latent generative space using a Conditional Gaussian Mixture Variational Autoencoder. To enable realistic generation of actions involving locomotion, we decouple local pose and global trajectory components of the action sequence. We incorporate duration-aware feature representations to enable variable-length sequence generation. We use a hybrid pose sequence representation with 3D pose sequences sourced from videos and 3D Kinect-based sequences of NTU-RGBD-120. To enable principled comparison of generation quality, we employ suitably modified strong baselines during evaluation. Although smaller and simpler compared to baselines, MUGL provides better quality generations, paving the way for practical and controllable large-scale human action generation.
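The decoupling of local pose from global trajectory mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of joint 0 as the root, the list-of-tuples layout, and the names `decouple` and `recombine` are all assumptions made for illustration.

```python
# Illustrative sketch only; not taken from the MUGL code base.
# A sequence is a list of frames, each frame a list of (x, y, z) joints.
# Assumption: joint index 0 is the root (e.g. the pelvis).

def decouple(seq, root=0):
    """Split a sequence into root-relative poses and the root's global path."""
    trajectory = [frame[root] for frame in seq]
    local_pose = [
        [tuple(c - r for c, r in zip(joint, frame[root])) for joint in frame]
        for frame in seq
    ]
    return local_pose, trajectory


def recombine(local_pose, trajectory):
    """Add the root position back to every joint to recover global coordinates."""
    return [
        [tuple(c + r for c, r in zip(joint, root_pos)) for joint in frame]
        for frame, root_pos in zip(local_pose, trajectory)
    ]


# Two frames of a 3-joint skeleton; the second frame is the first one
# shifted by one unit along x, i.e. pure locomotion with an unchanged pose.
seq = [
    [(0.0, 1.0, 0.0), (0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
    [(1.0, 1.0, 0.0), (1.0, 2.0, 0.0), (2.0, 2.0, 0.0)],
]
local_pose, trajectory = decouple(seq)
restored = recombine(local_pose, trajectory)
```

In this toy input the local pose is identical in both frames and all of the motion ends up in `trajectory`, which is the kind of separation that lets pose and locomotion be modelled as distinct components.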

Code Implementations: 1 repo
