CV Mar 18, 2020

DeepCap: Monocular Human Performance Capture Using Weak Supervision

Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

arXiv:2003.08325v1 31.7 260 citations

Originality: Highly original

AI Analysis

This paper addresses the cost of the multi-view setups traditionally required for human performance capture in applications such as movie production and virtual/augmented reality, offering a more accessible monocular alternative.

The paper tackles monocular dense human performance capture with a deep learning approach trained under weak supervision, which eliminates the need for 3D ground truth annotations. The authors report that it outperforms state-of-the-art methods in quality and robustness.

Human performance capture is a highly important computer vision problem with many applications in movie production and virtual/augmented reality. Many previous performance capture approaches either required expensive multi-view setups or did not recover dense space-time coherent geometry with frame-to-frame correspondences. We propose a novel deep learning approach for monocular dense human performance capture. Our method is trained in a weakly supervised manner based on multi-view supervision, completely removing the need for training data with 3D ground truth annotations. The network architecture is based on two separate networks that disentangle the task into a pose estimation step and a non-rigid surface deformation step. Extensive qualitative and quantitative evaluations show that our approach outperforms the state of the art in terms of quality and robustness.
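To make the idea of multi-view weak supervision concrete, the sketch below penalizes the 2D reprojection error of predicted 3D keypoints in several calibrated views, so no 3D ground truth is needed. It is an illustration only, not the authors' implementation: the abstract states just that training relies on multi-view supervision. The pinhole model, the translation-only cameras, and the names `project` and `multiview_keypoint_loss` are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project(point: Point3D, focal: float, center: Point2D,
            cam_offset: Point3D) -> Point2D:
    """Pinhole projection for a camera translated by cam_offset, looking down +z.

    Assumption: no rotation or lens distortion, purely for illustration.
    """
    x, y, z = (p - o for p, o in zip(point, cam_offset))
    return (focal * x / z + center[0], focal * y / z + center[1])


def multiview_keypoint_loss(pred_3d: Sequence[Point3D],
                            detections_2d: Sequence[Sequence[Point2D]],
                            cameras: Sequence[Dict]) -> float:
    """Mean squared 2D reprojection error over all views and keypoints.

    detections_2d[v][k] is the 2D detection of keypoint k in view v; the
    supervision signal comes only from these 2D detections.
    """
    total, count = 0.0, 0
    for cam, view_dets in zip(cameras, detections_2d):
        for p3d, det in zip(pred_3d, view_dets):
            u, v = project(p3d, **cam)
            total += (u - det[0]) ** 2 + (v - det[1]) ** 2
            count += 1
    return total / count
```

Each camera here is a dictionary with hypothetical keys `focal`, `center`, and `cam_offset`. With a single view, depth errors along the viewing ray produce no reprojection error; adding views from other positions is what makes a 2D loss informative about 3D structure.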
