CV · AI · Jun 17, 2024

NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation

arXiv:2406.11259v1
Originality: Incremental advance
AI Analysis

The paper addresses the efficiency bottleneck in real-time 3D talking head generation, though it is an incremental improvement over existing NeRF methods.

The paper tackles the slow rendering speed of NeRF-based talking head generation by proposing Neural Light Dynamic Fields (NLDF), which uses light segments and knowledge distillation to render about 30 times faster while maintaining comparable visual quality.

Talking head generation based on neural radiance fields (NeRF) has shown promising visual results. However, the slow rendering speed of NeRF seriously limits its application, because synthesizing a single pixel requires a costly computation over hundreds of sampled points. In this work, a novel Neural Light Dynamic Fields (NLDF) model is proposed to generate high-quality 3D talking faces with a significant speedup. NLDF represents light fields based on light segments, and a deep network learns the entire light beam's information at once. During training, knowledge distillation is applied: the NeRF-based synthesized result guides the correct coloration of light segments in NLDF. Furthermore, a novel active pool training strategy is proposed to focus on high-frequency movements, particularly of the speaker's mouth and eyebrows. The proposed method effectively represents the facial light dynamics in 3D talking video generation, and it runs approximately 30 times faster than a state-of-the-art NeRF-based method, with comparable visual quality.
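The cost difference described in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: `toy_radiance_field`, the 128-sample count, the greyscale colour, and the mean-squared-error distillation loss are all assumptions chosen for illustration, and the paper's light-segment parameterisation and network are not reproduced here. The sketch only contrasts a NeRF-style pixel, which needs one field query per sample along the ray, with a light-field-style pixel, which needs one query per ray, and shows how a NeRF render could serve as the distillation target.

```python
import math

def toy_radiance_field(point):
    """Hypothetical stand-in for a NeRF MLP: returns (density, grey colour)."""
    x, y, z = point
    density = 5.0 * math.exp(-((x ** 2 + y ** 2 + (z - 2.0) ** 2) / 0.1))
    colour = 0.5 + 0.5 * math.sin(3.0 * x)  # always within [0, 1]
    return density, colour

def render_nerf_pixel(origin, direction, near=0.0, far=4.0, n_samples=128):
    """Volume rendering: one field query per sample, alpha-composited."""
    delta = (far - near) / n_samples
    transmittance, pixel, queries = 1.0, 0.0, 0
    for i in range(n_samples):
        t = near + (i + 0.5) * delta
        point = tuple(o + t * d for o, d in zip(origin, direction))
        density, colour = toy_radiance_field(point)
        queries += 1
        alpha = 1.0 - math.exp(-density * delta)
        pixel += transmittance * alpha * colour
        transmittance *= 1.0 - alpha
    return pixel, queries

def render_light_field_pixel(student, origin, direction):
    """Light-field style: a single query maps the whole ray to a colour."""
    return student(origin, direction), 1

def distillation_loss(student, rays):
    """Assumed loss: MSE between student pixels and NeRF teacher pixels."""
    total = 0.0
    for origin, direction in rays:
        teacher_pixel, _ = render_nerf_pixel(origin, direction)
        student_pixel, _ = render_light_field_pixel(student, origin, direction)
        total += (student_pixel - teacher_pixel) ** 2
    return total / len(rays)

# Example rays, all pointing along +z from slightly different origins.
rays = [((0.1 * k, 0.0, 0.0), (0.0, 0.0, 1.0)) for k in range(-2, 3)]

# An untrained student that always predicts black, for comparison.
black_student = lambda origin, direction: 0.0
loss_before_training = distillation_loss(black_student, rays)
```

In this toy setup the teacher issues 128 field queries per pixel while the student issues one, which is the source of the speedup; the actual factor reported in the paper depends on network sizes and other details not modelled here.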