InstantAvatar: Efficient 3D Head Reconstruction via Surface Rendering
InstantAvatar enables efficient creation of 3D head avatars for applications such as VR/AR, though the contribution is incremental because it builds on existing neural field and rendering techniques.
The authors address slow 3D head reconstruction from images by introducing InstantAvatar, which achieves accuracy comparable to state-of-the-art methods with a 100x speed-up, reconstructing avatars in a few seconds from as few as one image.
Recent advances in full-head reconstruction have been obtained by optimizing a neural field through differentiable surface or volume rendering to represent a single scene. While these techniques achieve unprecedented accuracy, they take several minutes, or even hours, due to the expensive optimization process required. In this work, we introduce InstantAvatar, a method that recovers full-head avatars from few images (down to just one) in a few seconds on commodity hardware. To speed up the reconstruction process, we propose a system that combines, for the first time, a voxel-grid neural field representation with a surface renderer. Notably, a naive combination of these two techniques leads to unstable optimizations that do not converge to valid solutions. To overcome this limitation, we present a novel statistical model that learns a prior distribution over 3D head signed distance functions using a voxel-grid-based architecture. The use of this prior model, in combination with other design choices, results in a system that achieves 3D head reconstructions with accuracy comparable to the state of the art with a 100x speed-up.
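To make the pairing of a voxel-grid signed distance function (SDF) with a surface renderer more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a dense grid filled with the analytic SDF of a sphere (standing in for a learned head prior), trilinear interpolation for queries, and plain sphere tracing as the surface-finding step. The function names (`make_sphere_sdf_grid`, `query_sdf`, `sphere_trace`) and all parameter values are hypothetical, and NumPy is used for readability rather than a differentiable framework.

```python
import numpy as np

def make_sphere_sdf_grid(res=64, radius=0.5):
    """Sample a sphere's SDF on a res^3 grid spanning [-1, 1]^3 (stand-in for a learned grid)."""
    axis = np.linspace(-1.0, 1.0, res)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.sqrt(x**2 + y**2 + z**2) - radius

def query_sdf(grid, p):
    """Trilinearly interpolate the SDF grid at a point p, clamped to [-1, 1]^3."""
    res = grid.shape[0]
    # Map world coordinates to continuous voxel indices.
    u = (np.clip(p, -1.0, 1.0) + 1.0) * 0.5 * (res - 1)
    i0 = np.minimum(np.floor(u).astype(int), res - 2)  # lower corner of the cell
    t = u - i0                                         # fractional position in the cell
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((t[0] if dx else 1.0 - t[0]) *
                          (t[1] if dy else 1.0 - t[1]) *
                          (t[2] if dz else 1.0 - t[2]))
                value += weight * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return value

def sphere_trace(grid, origin, direction, max_steps=64, eps=1e-3, t_max=4.0):
    """March along a ray, stepping by the queried distance, until the surface is reached."""
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        p = origin + t * direction
        d = query_sdf(grid, p)
        if d < eps:      # close enough to the zero level set: report a hit
            return p
        t += d           # the SDF value is a safe step size
        if t > t_max:    # the ray has left the region of interest
            break
    return None

grid = make_sphere_sdf_grid()
hit = sphere_trace(grid, np.array([0.0, 0.0, -0.9]), np.array([0.0, 0.0, 1.0]))
miss = sphere_trace(grid, np.array([0.9, 0.9, -0.9]), np.array([0.0, 0.0, 1.0]))
```

In this sketch, `hit` lands close to the sphere surface along the z-axis and `miss` is `None`. In a reconstruction setting the grid values would be optimized rather than fixed, which is where the abstract reports that a naive combination becomes unstable without the learned prior.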