CV GR MM Mar 15, 2022

ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation

Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu

arXiv:2203.07706v2 26.6104 citations h-index: 58

Originality: Incremental advance

AI Analysis

This work addresses general action-conditioned 3D human motion generation, with applications in animation and robotics. It is an incremental advance that combines existing techniques (Transformer and GAN) with a new dataset.

The paper tackles the problem of generating 3D human motions conditioned on actions, covering both single-person and multi-person interactive actions. It proposes ActFormer, a GAN-based Transformer method, and introduces a new synthetic dataset of multi-person combat behaviors. It reports superior performance over state-of-the-art methods on multiple datasets.

We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions. Our approach consists of a powerful Action-conditioned motion TransFormer (ActFormer) under a GAN training scheme, equipped with a Gaussian Process latent prior. Such a design combines the strong spatio-temporal representation capacity of Transformer, superiority in generative modeling of GAN, and inherent temporal correlations from the latent prior. Furthermore, ActFormer can be naturally extended to multi-person motions by alternately modeling temporal correlations and human interactions with Transformer encoders. To further facilitate research on multi-person motion generation, we introduce a new synthetic dataset of complex multi-person combat behaviors. Extensive experiments on NTU-13, NTU RGB+D 120, BABEL and the proposed combat dataset show that our method can adapt to various human motion representations and achieve superior performance over the state-of-the-art methods on both single-person and multi-person motion generation tasks, demonstrating a promising step towards a general human motion generator.
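
The Gaussian Process latent prior mentioned in the abstract can be illustrated with a minimal sketch: instead of drawing an independent noise vector per frame, each latent dimension is sampled as a smooth function of time, so neighboring frames start from correlated latents. The RBF kernel, the length scale, the jitter value, and every function name below are assumptions chosen for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math
import random


def rbf_kernel(num_frames, length_scale, jitter=1e-4):
    """Covariance K[i][j] = exp(-(i - j)^2 / (2 * length_scale^2)), plus jitter on the diagonal.

    The RBF form and the jitter value are illustrative assumptions.
    """
    return [
        [
            math.exp(-((i - j) ** 2) / (2.0 * length_scale ** 2))
            + (jitter if i == j else 0.0)
            for j in range(num_frames)
        ]
        for i in range(num_frames)
    ]


def cholesky(matrix):
    """Return lower-triangular L such that L times its transpose equals matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_gp_latents(num_frames, latent_dim, length_scale, rng):
    """Draw a num_frames x latent_dim latent sequence.

    Each latent dimension is an independent zero-mean GP over the frame index,
    obtained by multiplying white Gaussian noise by the Cholesky factor.
    """
    lower = cholesky(rbf_kernel(num_frames, length_scale))
    latents = [[0.0] * latent_dim for _ in range(num_frames)]
    for d in range(latent_dim):
        white = [rng.gauss(0.0, 1.0) for _ in range(num_frames)]
        for t in range(num_frames):
            latents[t][d] = sum(lower[t][k] * white[k] for k in range(t + 1))
    return latents


# Hypothetical usage: 16 frames, 4 latent dimensions, fixed seed for repeatability.
latent_sequence = sample_gp_latents(16, 4, length_scale=4.0, rng=random.Random(0))
```

In a setup like the one the abstract describes, a sequence of this kind would be fed to the generator together with the action label. A larger length scale yields smoother latent trajectories, while a very small one approaches independent per-frame noise.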

