Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image
This work addresses the challenge of creating realistic 3D talking avatars from minimal input, with potential applications in virtual reality and gaming, though it appears incremental because it builds on existing parametric face models and 3D GAN priors.
The paper tackles the problem of building 3D animatable head avatars from a single image, a setting in which existing methods often fail under large camera pose variations, by proposing a framework that achieves high-quality 3D full-head modeling and real-time animation in a single feed-forward pass.
Building 3D animatable head avatars from a single image is an important yet challenging problem. Existing methods generally collapse under large camera pose variations, compromising the realism of 3D avatars. In this work, we propose a new framework to tackle the novel setting of one-shot 3D full-head animatable avatar reconstruction in a single feed-forward pass, enabling real-time animation and simultaneous 360$^\circ$ rendering views. To facilitate efficient animation control, we model 3D head avatars with Gaussian primitives embedded on the surface of a parametric face model within the UV space. To obtain knowledge of full-head geometry and textures, we leverage rich 3D full-head priors within a pretrained 3D generative adversarial network (GAN) for global full-head feature extraction and multi-view supervision. To increase the fidelity of the 3D reconstruction of the input image, we take advantage of the symmetric nature of the UV space and human faces to fuse local fine-grained input image features with the global full-head textures. Extensive experiments demonstrate the effectiveness of our method, achieving high-quality 3D full-head modeling as well as real-time animation, thereby improving the realism of 3D talking avatars.
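To make two ideas from the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes (1) each UV texel is bound to one triangle of the parametric face mesh by barycentric weights, so that Gaussian centers follow the mesh when it is animated, and (2) the UV layout is left-right symmetric, so a horizontal flip of the UV map sends a texel to its mirrored counterpart on the face. The function names `gaussian_centers` and `symmetric_fuse`, the array layouts, and the blending rule are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def gaussian_centers(vertices, faces, face_ids, bary, offsets=None):
    """Place one Gaussian center per UV texel on the mesh surface.

    vertices: (V, 3) mesh vertex positions (e.g. from a parametric face model)
    faces:    (F, 3) integer vertex indices per triangle
    face_ids: (N,)   triangle index each texel is bound to (assumed binding)
    bary:     (N, 3) barycentric weights of each texel inside its triangle
    offsets:  (N, 3) optional per-Gaussian displacement (hypothetical)
    """
    tri = vertices[faces[face_ids]]               # (N, 3, 3) triangle corners
    centers = np.einsum("nk,nkd->nd", bary, tri)  # barycentric interpolation
    if offsets is not None:
        centers = centers + offsets
    return centers


def symmetric_fuse(local_uv, visibility, global_uv):
    """Blend input-image features with a global prior in UV space.

    local_uv:   (H, W, C) features sampled from the input image
    visibility: (H, W)    values in [0, 1]; 1 means the texel was observed
    global_uv:  (H, W, C) full-head features (e.g. from a generative prior)

    Assumed priority: observed texel, then its mirrored texel, then the prior.
    """
    mirrored = local_uv[:, ::-1]        # horizontal flip = mirrored side
    mirrored_vis = visibility[:, ::-1]
    w_direct = visibility[..., None]
    w_mirror = ((1.0 - visibility) * mirrored_vis)[..., None]
    w_global = 1.0 - w_direct - w_mirror  # equals (1 - vis) * (1 - mirrored_vis)
    return w_direct * local_uv + w_mirror * mirrored + w_global * global_uv
```

Under these assumptions, animation reduces to recomputing `vertices` from new expression or pose parameters and calling `gaussian_centers` again: the per-texel binding (`face_ids`, `bary`) stays fixed, which is one reason a surface-embedded representation can be cheap to drive in real time. The fusion step lets texels hidden in the input view borrow features from their visible mirror image before falling back to the global prior.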