UV Volumes for Real-time Rendering of Editable Free-view Human Performance
This work addresses a critical bottleneck for immersive VR/AR applications by making editable human rendering more efficient, though it is an incremental improvement over existing neural rendering techniques.
The paper tackles the high computational cost of neural volume rendering for free-view human performance by proposing UV Volumes, an approach that separates high-frequency appearance from the 3D volume and encodes it in 2D neural texture stacks. This enables real-time rendering of 960 x 540 images at 30 FPS on average, with photo-realism comparable to state-of-the-art methods.
Neural volume rendering enables photo-realistic renderings of a human performer in free-view, a critical task in immersive VR/AR applications. But the practice is severely limited by high computational costs in the rendering process. To solve this problem, we propose UV Volumes, a new approach that can render an editable free-view video of a human performer in real-time. It separates the high-frequency (i.e., non-smooth) human appearance from the 3D volume, and encodes it into 2D neural texture stacks (NTS). The smooth UV volumes allow much smaller and shallower neural networks to obtain densities and texture coordinates in 3D while capturing detailed appearance in 2D NTS. For editability, the mapping between the parameterized human model and the smooth texture coordinates allows better generalization to novel poses and shapes. Furthermore, the use of NTS enables interesting applications, e.g., retexturing. Extensive experiments on CMU Panoptic, ZJU Mocap, and H36M datasets show that our model can render 960 x 540 images at 30 FPS on average with comparable photo-realism to state-of-the-art methods. The project and supplementary materials are available at https://fanegg.github.io/UV-Volumes.
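
The split between a smooth 3D volume and a detailed 2D texture can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each ray sample already carries a density and a (u, v) texture coordinate, composites the coordinates with standard volume-rendering weights, and looks up the color in a single plain texture array that stands in for the neural texture stack. The names `composite_weights`, `bilinear_sample`, and `render_ray` are hypothetical, and the learned networks, human body parts, and view-dependent decoding are left out.

```python
import numpy as np

def composite_weights(sigma, delta):
    """Standard volume-rendering weights w_i = T_i * alpha_i."""
    alpha = 1.0 - np.exp(-sigma * delta)
    # Transmittance T_i = prod_{j<i} (1 - alpha_j), with T_0 = 1.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    return trans * alpha

def bilinear_sample(texture, uv):
    """Bilinearly sample an (H, W, C) texture at uv in [0, 1]^2."""
    h, w, _ = texture.shape
    x, y = uv[0] * (w - 1), uv[1] * (h - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * texture[y0, x0] + fx * texture[y0, x1]
    bot = (1 - fx) * texture[y1, x0] + fx * texture[y1, x1]
    return (1 - fy) * top + fy * bot

def render_ray(sigma, uv_samples, delta, texture):
    """Composite per-sample UVs along one ray, then look up the texture.

    sigma:      (N,) densities (assumed to come from a small 3D network)
    uv_samples: (N, 2) texture coordinates per sample (same assumption)
    delta:      (N,) distances between consecutive samples
    texture:    (H, W, C) array standing in for the 2D neural texture stack
    """
    w = composite_weights(sigma, delta)
    opacity = w.sum()
    if opacity < 1e-8:
        # The ray hits nothing: return a transparent black pixel.
        return np.zeros(texture.shape[-1]), 0.0
    # Opacity-normalized expected UV coordinate along the ray.
    uv = np.clip((w[:, None] * uv_samples).sum(axis=0) / opacity, 0.0, 1.0)
    return opacity * bilinear_sample(texture, uv), opacity

# Example: one opaque sample in the middle of the ray.
sigma = np.array([0.0, 1e3, 0.0])
delta = np.ones(3)
uvs = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
texture = np.zeros((2, 2, 3))
texture[0, 1] = 1.0
texture[1, 0] = 1.0
texture[1, 1] = 2.0
color, opacity = render_ray(sigma, uvs, delta, texture)
```

In this sketch, retexturing amounts to calling `render_ray` with a different `texture` while the densities and texture coordinates stay unchanged, which is the kind of editing the 2D representation is meant to allow.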