NeuMan: Neural Human Radiance Field from a Single Video
NeuMan supports augmented reality experiences by rendering humans realistically from minimal input, though it relies on existing methods for geometry estimation.
The paper tackles the problem of photorealistic rendering and reposing of humans from a single video, achieving high-quality renderings of humans under novel poses and views, including details like cloth wrinkles, from just a 10-second clip.
Photorealistic rendering and reposing of humans is important for enabling augmented reality experiences. We propose a novel framework to reconstruct the human and the scene that can be rendered with novel human poses and views from just a single in-the-wild video. Given a video captured by a moving camera, we train two NeRF models: a human NeRF model and a scene NeRF model. To train these models, we rely on existing methods to estimate the rough geometry of the human and the scene. Those rough geometry estimates allow us to create a warping field from the observation space to the canonical pose-independent space, in which we train the human model. Our method is able to learn subject-specific details, including cloth wrinkles and accessories, from just a 10-second video clip, and to provide high-quality renderings of the human under novel poses, from novel views, together with the background.
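To make the idea of a warping field from observation space to canonical space more concrete, here is a minimal NumPy sketch. It is not the paper's implementation. It assumes the rough human geometry is available as a posed body mesh with one rigid canonical-to-posed transform per vertex, and that each sample point simply borrows the transform of its nearest posed vertex. The function name `warp_to_canonical` and all of its parameters are hypothetical.

```python
import numpy as np


def warp_to_canonical(points, posed_vertices, vertex_transforms):
    """Map observation-space sample points into a canonical space.

    Assumed (hypothetical) inputs:
      points:            (N, 3) sample points along camera rays
      posed_vertices:    (V, 3) vertices of the rough posed body mesh
      vertex_transforms: (V, 4, 4) canonical-to-posed rigid transform per vertex
    Returns an (N, 3) array of canonical-space points.
    """
    # Brute-force nearest posed vertex for every sample point.
    diff = points[:, None, :] - posed_vertices[None, :, :]
    nearest = (diff ** 2).sum(axis=-1).argmin(axis=1)

    # Undo the pose by inverting the nearest vertex's transform.
    inverse = np.linalg.inv(vertex_transforms[nearest])  # (N, 4, 4)

    # Apply the inverse transforms in homogeneous coordinates.
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.concatenate([points, ones], axis=1)  # (N, 4)
    canonical = np.einsum("nij,nj->ni", inverse, homogeneous)
    return canonical[:, :3]
```

Under these assumptions, the human NeRF would be queried at the returned canonical points rather than at the original sample points, so that a single pose-independent model can explain every frame of the video.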