CV · AI · LG · RO · Jun 7, 2019

Ego-Pose Estimation and Forecasting as Real-Time PD Control

arXiv:1906.03173v2 · 149 citations
Originality: Incremental advance
AI Analysis

This paper addresses real-time, physically valid human motion analysis for applications such as VR/AR. The advance is incremental because it builds on existing control and RL techniques.

The paper estimates and forecasts 3D human pose from egocentric videos using a PD-control-based policy learned via reinforcement learning. It reports better quantitative metrics and visual quality than prior methods, and the combined estimation and forecasting pipeline runs at 30 FPS.

We propose the use of a proportional-derivative (PD) control based policy learned via reinforcement learning (RL) to estimate and forecast 3D human pose from egocentric videos. The method learns directly from unsegmented egocentric videos and motion capture data consisting of various complex human motions (e.g., crouching, hopping, bending, and motion transitions). We propose a video-conditioned recurrent control technique to forecast physically-valid and stable future motions of arbitrary length. We also introduce a value function based fail-safe mechanism which enables our method to run as a single pass algorithm over the video data. Experiments with both controlled and in-the-wild data show that our approach outperforms previous art in both quantitative metrics and visual quality of the motions, and is also robust enough to transfer directly to real-world scenarios. Additionally, our time analysis shows that the combined use of our pose estimation and forecasting can run at 30 FPS, making it suitable for real-time applications.
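To make the PD control idea in the abstract concrete, here is a minimal sketch of a generic proportional-derivative law driving joint angles toward target angles. It is not the authors' implementation. The function names, the gains `kp` and `kd`, the unit-inertia joint model, and the time step are all assumptions chosen for illustration. In the paper's setting the targets would come from the learned policy and the torques would be applied in a physics simulator; here the targets are fixed and the dynamics are a toy integrator.

```python
def pd_torque(q, qd, q_target, kp, kd):
    """Generic PD law per joint: tau = kp * (q_target - q) - kd * qd."""
    return [kp * (t - a) - kd * v for a, v, t in zip(q, qd, q_target)]


def simulate(q0, q_target, kp=100.0, kd=20.0, dt=1.0 / 300.0, steps=600):
    """Drive toy unit-inertia joints toward q_target with PD torques.

    Assumption: each joint is an independent unit mass, so acceleration
    equals torque. A real humanoid simulator would replace this update.
    """
    q = list(q0)
    qd = [0.0] * len(q)
    for _ in range(steps):
        tau = pd_torque(q, qd, q_target, kp, kd)
        # Semi-implicit Euler: update velocity first, then position.
        qd = [v + dt * t for v, t in zip(qd, tau)]
        q = [a + dt * v for a, v in zip(q, qd)]
    return q, qd


if __name__ == "__main__":
    target = [0.5, -0.3, 1.0]  # hypothetical target joint angles (radians)
    final_q, final_qd = simulate([0.0, 0.0, 0.0], target)
    print(final_q, final_qd)
```

With `kp=100` and `kd=20` the toy joints are critically damped, so they settle on the targets without oscillating. This illustrates why a PD layer tends to produce smooth, stable motion from target poses.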

Code Implementations: 1 repo