DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis
The work addresses the problem of generating high-quality talking face videos with accurate hair motion for applications such as virtual avatars and entertainment, though it appears incremental because it builds on existing 3D Gaussian Splatting methods.
The paper tackles the challenge of synthesizing realistic talking face videos for individuals with long hair by proposing DEGSTalk, a method based on 3D Gaussian Splatting that uses decomposed per-embedding Gaussian fields and dynamic hair-preserving rendering, achieving improved realism and synthesis quality compared to existing approaches.
Accurately synthesizing talking face videos and capturing fine facial features for individuals with long hair remains a significant challenge. To address this limitation of existing methods, we propose DEGSTalk, a 3D Gaussian Splatting (3DGS)-based talking face synthesis method that uses decomposed per-embedding Gaussian fields to generate realistic talking faces with long hair. DEGSTalk employs Deformable Pre-Embedding Gaussian Fields, which dynamically adjust pre-embedding Gaussian primitives using implicit expression coefficients. This enables precise capture of dynamic facial regions and subtle expressions. Additionally, we propose a Dynamic Hair-Preserving Portrait Rendering technique to enhance the realism of long hair motion in the synthesized videos. Results show that DEGSTalk achieves improved realism and synthesis quality compared to existing approaches, particularly in handling complex facial dynamics and hair preservation. Our code will be publicly available at https://github.com/CVI-SZU/DEGSTalk.