Hierarchical Graph-Convolutional Variational AutoEncoding for Generative Modelling of Human Motion
This work addresses unified generative modeling of human motion. The contribution is incremental: it combines existing techniques, namely hierarchical VAEs and graph CNNs, to improve performance on specific tasks.
The authors address the modeling of human motion for both trajectory prediction and action classification by proposing a hierarchical graph-convolutional variational autoencoder (HG-VAE). The model generates coherent actions, detects out-of-distribution data, and imputes missing data, and it improves downstream discriminative learning on the H3.6M and AMASS datasets.
Models of human motion commonly focus on either trajectory prediction or action classification, but rarely both. The marked heterogeneity and intricate compositionality of human motion render each task vulnerable to the data degradation and distributional shift common to real-world scenarios. A sufficiently expressive generative model of action could in theory enable data conditioning and distributional resilience within a unified framework applicable to both tasks. Here we propose a novel architecture based on hierarchical variational autoencoders and deep graph convolutional neural networks for generating a holistic model of action over multiple time-scales. We show this Hierarchical Graph-convolutional Variational Autoencoder (HG-VAE) to be capable of generating coherent actions, detecting out-of-distribution data, and imputing missing data by gradient ascent on the model's posterior. With the model trained and evaluated on H3.6M and on AMASS, the largest collection of open-source human motion data, we show that HG-VAE can facilitate downstream discriminative learning better than baseline models.
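To make the imputation idea concrete, the following is a minimal sketch of filling in missing values by gradient ascent on a model density. It is not the HG-VAE procedure: as a loudly stated assumption, a fixed multivariate Gaussian stands in for the learned generative model, so that the gradient of the log-density is available in closed form. The function name `impute_by_gradient_ascent` and the parameters `lr` and `steps` are hypothetical and chosen only for illustration.

```python
import numpy as np


def impute_by_gradient_ascent(x, missing, mu, cov, lr=0.1, steps=2000):
    """Fill the missing entries of x by gradient ascent on log N(x; mu, cov).

    Assumption: a Gaussian density stands in for the learned generative model.
    x       : 1-D array; values at missing positions are ignored
    missing : boolean mask, True where the entry must be imputed
    """
    precision = np.linalg.inv(cov)
    # Start the missing coordinates at the model mean; keep observed ones.
    x = np.where(missing, mu, x).astype(float)
    for _ in range(steps):
        # Gradient of the Gaussian log-density with respect to x.
        grad = -precision @ (x - mu)
        # Move only the missing coordinates; observed values stay clamped.
        x = x + lr * grad * missing
    return x


if __name__ == "__main__":
    mu = np.zeros(3)
    cov = np.array([[1.0, 0.8, 0.2],
                    [0.8, 1.0, 0.3],
                    [0.2, 0.3, 1.0]])
    observed = np.array([1.0, 0.0, 0.0])        # last two entries are placeholders
    mask = np.array([False, True, True])
    print(impute_by_gradient_ascent(observed, mask, mu, cov))
```

For a Gaussian, this ascent converges to the conditional mean of the missing coordinates given the observed ones, which gives a simple check that the loop behaves as intended. With a deep generative model the gradient would instead come from automatic differentiation, and the step size would need tuning.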