Dynamic Surface Function Networks for Clothed Human Bodies
This work addresses the creation of detailed, dynamic models of clothed humans for animation and virtual reality, and represents an incremental improvement over existing methods.
The paper reconstructs and tracks clothed humans from monocular RGB-D sequences by learning a person-specific body model based on a dynamic surface function network. The result is a temporally coherent mesh sequence and the ability to synthesize new animations with pose-dependent deformations.
We present a novel method for temporally coherent reconstruction and tracking of clothed humans. Given a monocular RGB-D sequence, we learn a person-specific body model based on a dynamic surface function network. To this end, we explicitly model the surface of the person using a multi-layer perceptron (MLP) embedded in the canonical space of the SMPL body model. With classical forward rendering, the represented surface can be rasterized using the topology of a template mesh. For each surface point of the template mesh, the MLP is evaluated to predict the actual surface location. To handle pose-dependent deformations, the MLP is conditioned on the SMPL pose parameters. We show that this surface representation and the pose parameters can be learned in a self-supervised fashion using the principle of analysis-by-synthesis and differentiable rasterization. As a result, we are able to reconstruct a temporally coherent mesh sequence from the input data. The underlying surface representation can be used to synthesize new animations of the reconstructed person, including pose-dependent deformations.
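To make the idea of a pose-conditioned surface function concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that the MLP takes a canonical template point concatenated with a 72-dimensional SMPL pose vector (24 joints, 3 axis-angle values each) and predicts a 3D offset that is added to the template point. The offset formulation, the layer sizes, and all function names (init_mlp, mlp_forward, surface_point, evaluate_surface) are illustrative assumptions. Skinning into posed space, differentiable rasterization, and training are omitted.

```python
import math
import random

POSE_DIM = 72  # SMPL pose: 24 joints x 3 axis-angle parameters


def init_mlp(sizes, seed=0):
    """Create a list of (weights, biases) layers with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def mlp_forward(layers, x):
    """Apply the layers in order, with ReLU on every layer except the last."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def surface_point(layers, template_point, pose):
    """Predict a canonical-space surface location for one template point.

    Assumption: the network outputs an offset from the template point
    rather than an absolute position.
    """
    offset = mlp_forward(layers, list(template_point) + list(pose))
    return [p + o for p, o in zip(template_point, offset)]


def evaluate_surface(layers, template_vertices, pose):
    """Evaluate the surface function at every template vertex for one pose."""
    return [surface_point(layers, v, pose) for v in template_vertices]


# Hypothetical usage: two template vertices evaluated in the rest pose.
layers = init_mlp([3 + POSE_DIM, 32, 3])
template_vertices = [[0.0, 0.0, 0.0], [0.1, 0.2, 0.3]]
rest_pose = [0.0] * POSE_DIM
surface = evaluate_surface(layers, template_vertices, rest_pose)
```

Because the pose vector is part of the network input, the same template point can map to different surface locations under different poses, which is how pose-dependent deformations enter this sketch. In the method described above, the resulting vertices would be rasterized with the template topology and compared against the RGB-D input to drive learning.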