On human motion prediction using recurrent neural networks
This paper addresses the problem of accurate human motion modeling for applications such as virtual reality, but its contribution is incremental because it builds on existing RNN methods.
The paper tackles human motion prediction with recurrent neural networks (RNNs). It shows that a simple baseline that does not model motion at all can achieve state-of-the-art performance, and it uses that finding to motivate a simple, scalable RNN architecture that obtains state-of-the-art results.
Human motion modelling is a classical problem at the intersection of graphics and computer vision, with applications spanning human-computer interaction, motion synthesis, and motion prediction for virtual and augmented reality. Following the success of deep learning methods in several computer vision tasks, recent work has focused on using deep recurrent neural networks (RNNs) to model human motion, with the goal of learning time-dependent representations that perform tasks such as short-term motion prediction and long-term human motion synthesis. We examine recent work, with a focus on the evaluation methodologies commonly used in the literature, and show that, surprisingly, state-of-the-art performance can be achieved by a simple baseline that does not attempt to model motion at all. We investigate this result, and analyze recent RNN methods by looking at the architectures, loss functions, and training procedures used in state-of-the-art approaches. We propose three changes to the standard RNN models typically used for human motion, which result in a simple and scalable RNN architecture that obtains state-of-the-art performance on human motion prediction.
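To make the idea of a baseline that does not model motion concrete, the following is a minimal sketch of a constant-pose predictor: it repeats the last observed frame for every future time step. The abstract does not spell out the baseline, so this specific form is an assumption for illustration. The names `zero_velocity_baseline` and `mean_abs_error`, the flat list-of-floats pose layout, and the error metric are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# One frame: a flat vector of pose parameters (hypothetical layout, for illustration).
Pose = List[float]


def zero_velocity_baseline(observed: Sequence[Pose], horizon: int) -> List[Pose]:
    """Predict `horizon` future frames by repeating the last observed pose."""
    if not observed:
        raise ValueError("need at least one observed frame")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    last = observed[-1]
    # Copy the pose for each step so the returned frames are independent lists.
    return [list(last) for _ in range(horizon)]


def mean_abs_error(predicted: Sequence[Pose], target: Sequence[Pose]) -> float:
    """Mean absolute per-coordinate error over all frames (illustrative metric only)."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same number of frames")
    total = 0.0
    count = 0
    for p, t in zip(predicted, target):
        if len(p) != len(t):
            raise ValueError("frames must have the same dimensionality")
        for a, b in zip(p, t):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    observed = [[0.0, 1.0], [0.5, 1.5]]          # two observed frames
    future = [[0.5, 1.5], [0.5, 2.5], [1.5, 1.5]]  # made-up ground truth
    prediction = zero_velocity_baseline(observed, horizon=len(future))
    print(prediction)
    print(mean_abs_error(prediction, future))
```

A predictor like this has no parameters and no notion of dynamics, so it is a useful sanity check: any learned model evaluated under the same protocol should be compared against it before its results are interpreted.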