PersonNeRF: Personalized Reconstruction from Photo Collections
The paper tackles the problem of reconstructing a personalized 3D model of a subject from a sparse photo collection that spans varying poses and appearances. The resulting model renders novel combinations of viewpoint, pose, and appearance, and the authors report results that outperform prior work.
The work addresses free-viewpoint human rendering for applications such as virtual reality and digital avatars, though the contribution is incremental in that it builds on existing NeRF methods.
We present PersonNeRF, a method that takes a collection of photos of a subject (e.g. Roger Federer) captured across multiple years with arbitrary body poses and appearances, and enables rendering the subject with arbitrary novel combinations of viewpoint, body pose, and appearance. PersonNeRF builds a customized neural volumetric 3D model of the subject that is able to render an entire space spanned by camera viewpoint, body pose, and appearance. A central challenge in this task is dealing with sparse observations; a given body pose is likely only observed by a single viewpoint with a single appearance, and a given appearance is only observed under a handful of different body poses. We address this issue by recovering a canonical T-pose neural volumetric representation of the subject that allows for changing appearance across different observations, but uses a shared pose-dependent motion field across all observations. We demonstrate that this approach, along with regularization of the recovered volumetric geometry to encourage smoothness, is able to recover a model that renders compelling images from novel combinations of viewpoint, pose, and appearance from these challenging unstructured photo collections, outperforming prior work for free-viewpoint human rendering.
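The factorization described in the abstract (one canonical volume whose appearance can vary per observation, plus one motion field shared by all observations) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the tiny random MLPs, the simple offset-based warp, the embedding sizes `POSE_DIM` and `APP_DIM`, and every function name below are assumptions made for illustration only, and the smoothness regularization on geometry is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

POSE_DIM, APP_DIM = 8, 4  # hypothetical embedding sizes, not from the paper


def init_mlp(sizes):
    """Random weights for a small fully connected network (illustrative only)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Apply the network with ReLU on every layer except the last."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x


# One motion field shared by every observation (depends on pose only).
motion_params = init_mlp([3 + POSE_DIM, 32, 3])
# One canonical field; the appearance code lets its output vary per observation.
canon_params = init_mlp([3 + APP_DIM, 32, 4])


def warp_to_canonical(x_obs, pose):
    """Map observation-space points to canonical space.

    Assumption: a learned offset stands in for the pose-dependent motion field.
    """
    pose_tiled = np.broadcast_to(pose, (x_obs.shape[0], POSE_DIM))
    offset = mlp(motion_params, np.concatenate([x_obs, pose_tiled], axis=-1))
    return x_obs + offset


def query(x_obs, pose, appearance):
    """Return per-point RGB in [0, 1] and non-negative density."""
    x_can = warp_to_canonical(x_obs, pose)
    app_tiled = np.broadcast_to(appearance, (x_obs.shape[0], APP_DIM))
    out = mlp(canon_params, np.concatenate([x_can, app_tiled], axis=-1))
    rgb = 1.0 / (1.0 + np.exp(-out[:, :3]))  # sigmoid
    sigma = np.maximum(out[:, 3], 0.0)       # ReLU
    return rgb, sigma


def render_ray(origin, direction, pose, appearance,
               n_samples=32, near=0.0, far=2.0):
    """Standard NeRF-style volume rendering along a single ray."""
    t = np.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction
    rgb, sigma = query(pts, pose, appearance)
    delta = (far - near) / (n_samples - 1)          # uniform sample spacing
    alpha = 1.0 - np.exp(-sigma * delta)            # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    color = (weights[:, None] * rgb).sum(axis=0)
    return color, weights.sum()


# Any viewpoint (ray), pose, and appearance code can be combined freely.
pose = rng.normal(size=POSE_DIM)
appearance = rng.normal(size=APP_DIM)
color, opacity = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                            pose, appearance)
```

Because the warp takes only the pose and the canonical field takes only the appearance code, a pose seen under one appearance can in principle be rendered under another, which is the property the abstract relies on to cope with sparse observations.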