Novel-View Human Action Synthesis
This work addresses the problem of generating human actions from new viewpoints, with applications in VR/AR and robotics, and represents an incremental improvement in the field.
The paper tackles novel-view human action synthesis by developing a 3D reasoning approach that estimates a 3D mesh, transfers textures, and uses a context-based generator to complete appearance information, validated on the NTU RGB+D dataset.
Novel-View Human Action Synthesis aims to synthesize the movement of a body from a virtual viewpoint, given a video from a real viewpoint. We present a novel 3D reasoning approach to synthesize the target viewpoint. We first estimate the 3D mesh of the target body and transfer the rough textures from the 2D images to the mesh. As this transfer may generate sparse textures on the mesh due to frame resolution or occlusions, we produce a semi-dense textured mesh by propagating the transferred textures both locally, within local geodesic neighborhoods, and globally, across symmetric semantic parts. Next, we introduce a context-based generator to learn how to correct and complete the residual appearance information. This allows the network to independently focus on learning the foreground and background synthesis tasks. We validate the proposed solution on the public NTU RGB+D dataset. The code and resources are available at https://bit.ly/36u3h4K.
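The two-stage texture propagation described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the mesh is reduced to a toy vertex chain, graph hops stand in for geodesic neighborhoods, the `mirror` index map stands in for the correspondence between symmetric semantic parts, and the function names `propagate_local` and `propagate_symmetric` are hypothetical.

```python
import numpy as np


def propagate_local(colors, known, neighbors, hops=1):
    """Fill untextured vertices with the mean color of textured neighbors.

    Assumption: graph hops approximate local geodesic neighborhoods.
    """
    colors = colors.astype(float).copy()
    known = known.copy()
    for _ in range(hops):
        new_colors = colors.copy()
        new_known = known.copy()
        for v, nbrs in enumerate(neighbors):
            if known[v]:
                continue
            # Only neighbors that were textured before this hop contribute.
            src = [n for n in nbrs if known[n]]
            if src:
                new_colors[v] = colors[src].mean(axis=0)
                new_known[v] = True
        colors, known = new_colors, new_known
    return colors, known


def propagate_symmetric(colors, known, mirror):
    """Copy the color of the mirrored vertex when only the mirror is textured.

    Assumption: mirror[v] is the index of the vertex symmetric to v.
    """
    colors = colors.copy()
    known_out = known.copy()
    for v, m in enumerate(mirror):
        if not known[v] and known[m]:
            colors[v] = colors[m]
            known_out[v] = True
    return colors, known_out


# Toy mesh: six vertices in a chain 0-1-2-3-4-5, mirrored about the middle.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
mirror = [5, 4, 3, 2, 1, 0]

# Sparse transfer result: only vertex 0 received a texture (red).
colors = np.zeros((6, 3))
colors[0] = [1.0, 0.0, 0.0]
known = np.array([True, False, False, False, False, False])

colors, known = propagate_local(colors, known, neighbors, hops=1)
colors, known = propagate_symmetric(colors, known, mirror)

# Vertices 2 and 3 stay untextured, so the result is semi-dense, not dense.
print(known.tolist())
```

In this toy case the local step textures vertex 1 and the symmetric step textures vertices 4 and 5, while vertices 2 and 3 remain empty. That remaining gap is the kind of residual appearance information the abstract assigns to the context-based generator.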