SMPLpix: Neural Avatars from 3D Human Models
This work addresses the need for flexible control in human image generation, which matters for applications in computer graphics and virtual reality. Its contribution is incremental in that it combines existing 3D body models with generative networks.
The paper tackles the problem of controlling generative models of human images by bridging geometry-based rendering and deep generative networks. The resulting method directly converts 3D mesh vertices into photorealistic images, with better photorealism and rendering efficiency than conventional differentiable renderers.
Recent advances in deep generative models have led to an unprecedented level of realism for synthetically generated images of humans. However, one of the remaining fundamental limitations of these models is the lack of flexible control over the generative process, e.g., changing the camera and human pose while retaining the subject identity. At the same time, deformable human body models like SMPL and its successors provide full control over pose and shape but rely on classic computer graphics pipelines for rendering. Such rendering pipelines require explicit mesh rasterization that (a) does not have the potential to fix artifacts or lack of realism in the original 3D geometry and (b) until recently, was not fully incorporated into deep learning frameworks. In this work, we propose to bridge the gap between classic geometry-based rendering and the latest generative networks operating in pixel space. We train a network that directly converts a sparse set of 3D mesh vertices into photorealistic images, alleviating the need for a traditional rasterization mechanism. We train our model on a large corpus of human 3D models and corresponding real photos, and show the advantage over conventional differentiable renderers both in terms of the level of photorealism and rendering efficiency.
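The abstract does not say how the sparse vertices are presented to the network. As an illustration only, one plausible preprocessing step is to project colored vertices onto the image plane and keep the nearest vertex per pixel, which yields a sparse RGB-D image that a pixel-space network could then complete. The sketch below assumes a pinhole camera with the principal point at the image center, vertices already in camera coordinates, and per-vertex colors; the function name `project_vertices` and all of its parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def project_vertices(vertices, colors, focal, image_size):
    """Splat colored 3D vertices into a sparse RGB-D image (hypothetical helper).

    vertices:   (N, 3) points in camera coordinates (assumed convention)
    colors:     (N, 3) per-vertex RGB values in [0, 1]
    focal:      focal length in pixels (assumed pinhole camera)
    image_size: (height, width) of the output image
    """
    h, w = image_size
    rgbd = np.zeros((h, w, 4), dtype=np.float32)       # RGB + depth channels
    zbuf = np.full((h, w), np.inf, dtype=np.float32)   # nearest depth per pixel

    # Keep only vertices in front of the camera to avoid dividing by z <= 0.
    front = vertices[:, 2] > 0
    pts, cols = vertices[front], colors[front]
    z = pts[:, 2]

    # Pinhole projection, principal point assumed at the image center.
    u = np.round(focal * pts[:, 0] / z + (w - 1) / 2).astype(int)
    v = np.round(focal * pts[:, 1] / z + (h - 1) / 2).astype(int)

    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for i in np.flatnonzero(inside):
        # Z-buffer test: a closer vertex overwrites a farther one.
        if z[i] < zbuf[v[i], u[i]]:
            zbuf[v[i], u[i]] = z[i]
            rgbd[v[i], u[i], :3] = cols[i]
            rgbd[v[i], u[i], 3] = z[i]
    return rgbd
```

Pixels that receive no vertex stay zero, so the output is mostly empty. Under this assumption, no mesh faces are rasterized; filling in the gaps is left to the generative network.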