Predicting People's 3D Poses from Short Sequences
This work addresses 3D human pose estimation for computer vision applications. The contribution is incremental, as it builds on existing motion-based methods.
The paper recovers 3D human poses from video by regressing directly from a spatio-temporal block of frames to the pose in the central frame, which improves on state-of-the-art results on challenging sequences.
We propose an efficient approach to exploiting motion information from consecutive frames of a video sequence to recover the 3D pose of people. Instead of computing candidate poses in individual frames and then linking them, as is often done, we regress directly from a spatio-temporal block of frames to a 3D pose in the central one. We demonstrate that this approach allows us to effectively overcome ambiguities and to improve upon the state of the art on challenging sequences.
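
To make the idea of regressing from a spatio-temporal block to the central frame's pose concrete, the following is a minimal sketch, not the method used in the paper. It assumes grayscale frames stored as a NumPy array, raw pixel values as features, and a closed-form linear ridge regressor as a stand-in for whatever regressor is actually used. The names `extract_blocks`, `fit_ridge`, `half_window`, and the choice of 17 joints are hypothetical and chosen only for illustration.

```python
import numpy as np


def extract_blocks(frames, half_window):
    """Cut one spatio-temporal block per frame that has a full temporal neighbourhood.

    frames: array of shape (num_frames, height, width).
    Returns the blocks, shape (num_blocks, 2 * half_window + 1, height, width),
    and the index of the central frame of each block.
    """
    num_frames = frames.shape[0]
    if num_frames < 2 * half_window + 1:
        raise ValueError("sequence is shorter than one temporal window")
    centers = np.arange(half_window, num_frames - half_window)
    blocks = np.stack(
        [frames[c - half_window : c + half_window + 1] for c in centers]
    )
    return blocks, centers


def fit_ridge(features, targets, reg=1e-3):
    """Closed-form ridge regression: W = (X^T X + reg * I)^-1 X^T Y."""
    dim = features.shape[1]
    gram = features.T @ features + reg * np.eye(dim)
    return np.linalg.solve(gram, features.T @ targets)


# Synthetic stand-in data (assumption): 20 frames of 8x8 pixels and
# 17 joints with 3 coordinates each per frame.
rng = np.random.default_rng(0)
frames = rng.normal(size=(20, 8, 8))
poses = rng.normal(size=(20, 17 * 3))

# Each block of 5 consecutive frames is flattened into one feature vector
# and paired with the 3D pose of its central frame only.
blocks, centers = extract_blocks(frames, half_window=2)
features = blocks.reshape(len(blocks), -1)
targets = poses[centers]

weights = fit_ridge(features, targets)
predicted = features @ weights  # one 3D pose per block, for the central frame
```

The point of the sketch is the pairing of inputs and targets: the regressor sees the whole temporal window at once and outputs a single pose, so no per-frame candidate poses are computed or linked afterwards. Frames near the start and end of the sequence have no full window and are skipped here.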