Long Term Motion Prediction Using Keyposes
This work is significant for safety-critical applications like human-robot interaction and autonomous driving, where longer-term and more realistic human motion predictions are crucial.
This paper addresses long-term human motion prediction by proposing a method that predicts a few keyposes and interpolates the intermediate poses, rather than predicting every pose. This approach enables realistic motion forecasting for up to 5 seconds, well beyond the typical 1-second prediction horizon found in the literature.
Long-term human motion prediction is essential in safety-critical applications such as human-robot interaction and autonomous driving. In this paper, we show that long-term forecasting does not require predicting the human pose at every time instant. Instead, it is more effective to predict a few keyposes and approximate the intermediate poses by interpolating between them. We demonstrate that our approach can predict realistic motions up to 5 seconds into the future, far longer than the typical 1 second encountered in the literature. Furthermore, because we model future keyposes probabilistically, we can generate multiple plausible future motions by sampling at inference time. Over this extended time period, our predictions are more realistic, more diverse, and better at preserving the motion dynamics than those of state-of-the-art methods.
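To make the interpolation step concrete, the following is a minimal sketch of how a dense motion could be recovered from a handful of timed keyposes. It is an illustration under stated assumptions, not the paper's implementation: the abstract only says intermediate poses are approximated by interpolating the keyposes, so linear interpolation, the flattened-coordinate pose representation, and the names `Pose` and `interpolate_keyposes` are all hypothetical choices made here.

```python
from typing import List, Sequence, Tuple

# Assumption: a pose is a flat list of joint coordinates.
Pose = List[float]


def interpolate_keyposes(
    keyposes: Sequence[Tuple[float, Pose]], fps: float
) -> List[Pose]:
    """Build a dense pose sequence from (time_in_seconds, pose) keyposes.

    Keyposes must be sorted by strictly increasing time. Linear
    interpolation is an assumption of this sketch.
    """
    if len(keyposes) < 2:
        raise ValueError("need at least two keyposes")

    t_start, t_end = keyposes[0][0], keyposes[-1][0]
    n_frames = int(round((t_end - t_start) * fps)) + 1

    frames: List[Pose] = []
    seg = 0  # index of the keypose that starts the current segment
    for i in range(n_frames):
        t = min(t_start + i / fps, t_end)
        # Advance to the segment whose time span contains t.
        while seg < len(keyposes) - 2 and t > keyposes[seg + 1][0]:
            seg += 1
        (t0, p0), (t1, p1) = keyposes[seg], keyposes[seg + 1]
        alpha = (t - t0) / (t1 - t0)  # 0 at p0, 1 at p1
        frames.append([(1 - alpha) * a + alpha * b for a, b in zip(p0, p1)])
    return frames


# Three made-up keyposes spanning a 5-second horizon, sampled at 2 fps.
example = [(0.0, [0.0, 0.0]), (1.0, [1.0, 2.0]), (5.0, [5.0, 2.0])]
motion = interpolate_keyposes(example, fps=2.0)
```

In this sketch, the predictor only has to output the few `(time, pose)` pairs; every other frame is derived from them. Sampling several keypose sequences from a probabilistic model and interpolating each one would give multiple candidate futures, though how the keyposes are sampled is outside the scope of this example.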