CV · Jul 22, 2020

Contact and Human Dynamics from Monocular Video

arXiv:2007.11678v2 · 128 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating accurate, physically valid human motion for applications such as character animation and pose estimation. It is rated an incremental advance because it builds on existing kinematic methods.

The paper tackles the problem of generating physically plausible 3D human motion from monocular video, correcting errors in existing methods such as feet penetrating the ground and bodies leaning at extreme angles. It introduces a physics-based approach that combines estimated ground contact timings with trajectory optimization, producing significantly more realistic motions and improving quantitative measures of both kinematic and dynamic plausibility.

Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors that violate physical constraints, such as feet penetrating the ground and bodies leaning at extreme angles. In this paper, we present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input. We first estimate ground contact timings with a novel prediction network which is trained without hand-labeled data. A physics-based trajectory optimization then solves for a physically-plausible motion, based on the inputs. We show this process produces motions that are significantly more realistic than those from purely kinematic methods, substantially improving quantitative measures of both kinematic and dynamic plausibility. We demonstrate our method on character animation and pose estimation tasks on dynamic motions of dancing and sports with complex contact patterns.
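
To make the idea of contact-constrained refinement concrete, here is a toy sketch, not the paper's method. The paper optimizes full-body motion under physical dynamics. This example only refines a single 1-D foot-height track so that it stays near the kinematic estimate, is smooth, sits exactly on the ground at frames labeled as contact, and never goes below the ground. The function name `refine_foot_height`, the smoothness weight, and the projected-gradient solver are all assumptions chosen for illustration.

```python
import numpy as np

def refine_foot_height(h0, contact, w_smooth=1.0, iters=500):
    """Toy contact-constrained refinement of a foot-height track.

    Hypothetical helper, not from the paper. Minimizes
        sum (h - h0)^2 + w_smooth * sum (second difference of h)^2
    subject to h[t] == 0 at contact frames and h[t] >= 0 elsewhere,
    using projected gradient descent.
    """
    h0 = np.asarray(h0, dtype=float)
    contact = np.asarray(contact, dtype=bool)
    n = len(h0)

    # Second-difference operator: (D h)[t] = h[t] - 2 h[t+1] + h[t+2]
    D = np.zeros((n - 2, n))
    for t in range(n - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]

    # Conservative step size: the largest eigenvalue of D^T D is at most 16,
    # so the gradient's Lipschitz constant is at most 2 + 32 * w_smooth.
    step = 1.0 / (2.0 + 2.0 * w_smooth * 16.0)

    h = h0.copy()
    for _ in range(iters):
        grad = 2.0 * (h - h0) + 2.0 * w_smooth * (D.T @ (D @ h))
        h = h - step * grad
        # Project onto the feasible set: no ground penetration,
        # and feet exactly on the ground during contact.
        h = np.maximum(h, 0.0)
        h[contact] = 0.0
    return h

# Example: a noisy height track that dips below the ground (negative values).
h0 = [0.05, -0.02, -0.03, 0.10, 0.20, 0.08, -0.01]
contact = [True, True, True, False, False, False, True]
h = refine_foot_height(h0, contact)
```

In this sketch the contact labels are supplied by hand. In the paper they come from the learned contact prediction network, and the optimization covers the whole body rather than one scalar track.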

Code Implementations: 1 repo
