Learning Body Shape and Pose from Dense Correspondences
This work addresses the challenge of 3D human modeling for computer vision applications without relying on costly 3D annotations, though it appears incremental in that it builds on existing correspondence-based methods.
The paper tackles the problem of learning 3D human pose and body shape from 2D images without using 3D datasets, achieving this by leveraging dense correspondences annotated on in-the-wild images and a deform-and-learn training strategy.
In this paper, we address the problem of learning 3D human pose and body shape from 2D image datasets, without requiring 3D datasets of body shape and pose. The idea is to use dense correspondences between image points and a body surface, which can be annotated on in-the-wild 2D images, and to extract and aggregate 3D information from them. To do so, we propose a training strategy called "deform-and-learn", in which we alternate between deformable surface registration and training of deep convolutional neural networks (ConvNets). Unlike previous approaches, our method requires neither 3D pose annotations from a motion capture (MoCap) system nor human intervention to validate 3D pose annotations.
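The alternation behind "deform-and-learn" can be illustrated with a deliberately small toy sketch. It is not the method of the paper: the body surface is replaced by a single 3D vector per sample, deformable registration by gradient descent on a 2D reprojection error, and the ConvNet by a linear least-squares regressor. The names `P`, `register`, `reprojection_error`, and `deform_and_learn` are hypothetical, and re-initializing each registration from the current predictions is an assumption about how the two steps are coupled.

```python
import numpy as np

# Toy orthographic projection (assumption): keeps x and y, drops depth.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])


def register(theta, target_2d, steps=50, lr=0.2):
    """Toy 'deform' step: gradient descent on the squared reprojection error."""
    theta = theta.copy()
    for _ in range(steps):
        residual = P @ theta - target_2d
        theta -= lr * 2.0 * (P.T @ residual)  # gradient of ||P theta - c||^2
    return theta


def reprojection_error(theta, targets_2d):
    """Mean 2D distance between projected estimates and the 2D targets."""
    return float(np.mean(np.linalg.norm(theta @ P.T - targets_2d, axis=1)))


def deform_and_learn(features, targets_2d, rounds=2):
    """Alternate registration ('deform') and regressor fitting ('learn')."""
    theta = np.zeros((features.shape[0], 3))  # initial 3D estimates
    weights = None
    for _ in range(rounds):
        # Deform: fit each sample's 3D estimate to its 2D evidence.
        theta = np.stack([register(t, c) for t, c in zip(theta, targets_2d)])
        # Learn: fit a linear regressor (stand-in for the ConvNet).
        weights, *_ = np.linalg.lstsq(features, theta, rcond=None)
        # Use the predictions to initialize the next registration round.
        theta = features @ weights
    return weights, theta


# Synthetic data: 2D targets are projections of hidden 3D vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))        # stand-in for image features
theta_true = X @ rng.normal(size=(4, 3))
C = theta_true @ P.T                # stand-in for 2D correspondences

err_init = reprojection_error(np.zeros((20, 3)), C)
W, theta_final = deform_and_learn(X, C)
err_final = reprojection_error(theta_final, C)
```

In this toy the depth component is never observed, so it stays at its initial value; only the 2D reprojection error is driven down. The sketch is meant to show the control flow of the alternation, not to reproduce how 3D information is recovered from dense correspondences.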