CV · Mar 17, 2023

Style Transfer for 2D Talking Head Animation

arXiv:2303.09799v2 · 3 citations · h-index: 14
Originality: Incremental advance
AI Analysis

The paper addresses the open problem of learning different talking or singing styles in audio-driven animation. The advance is incremental because it builds on existing photo-realistic animation methods.

The paper tackles the problem of generating 2D talking head animation with learnable styles, enabling style transfer from reference frames to new images based on audio input, and achieves better qualitative and quantitative results than state-of-the-art methods.

Audio-driven talking head animation is a challenging research topic with many real-world applications. Recent works have focused on creating photo-realistic 2D animation, while learning different talking or singing styles remains an open problem. In this paper, we present a new method to generate talking head animation with learnable style references. Given a set of style reference frames, our framework can reconstruct 2D talking head animation based on a single input image and an audio stream. Our method first produces facial landmark motion from the audio stream and constructs intermediate style patterns from the style reference images. We then feed both outputs into a style-aware image generator to generate photo-realistic, high-fidelity 2D animation. In practice, our framework can extract the style information of a specific character and transfer it to any new static image for talking head animation. Extensive experimental results show that our method achieves better results than recent state-of-the-art approaches, both qualitatively and quantitatively.
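The data flow described in the abstract (audio to landmark motion, style reference frames to style patterns, both into a style-aware generator) can be sketched as below. This is only an illustrative sketch of the three-stage wiring, not the authors' implementation: every function name, the landmark count of 68, the style vector size, and the arithmetic inside each stage are hypothetical placeholders standing in for learned networks that the abstract does not specify.

```python
import numpy as np

N_LANDMARKS = 68  # assumption: a common 2D facial-landmark count, not given in the abstract
STYLE_DIM = 16    # assumption: illustrative size of the style vector


def audio_to_landmark_motion(audio, n_frames, rng):
    """Placeholder for stage 1: audio stream -> per-frame landmark displacements."""
    # Split the waveform into one chunk per video frame and use the chunk's
    # RMS energy to scale a fixed random displacement pattern.
    chunks = np.array_split(audio, n_frames)
    energy = np.array([np.sqrt(np.mean(c ** 2)) for c in chunks])
    pattern = rng.standard_normal((N_LANDMARKS, 2))
    return energy[:, None, None] * pattern[None, :, :]  # (n_frames, 68, 2)


def extract_style_pattern(style_frames):
    """Placeholder for stage 2: style reference frames -> one style vector."""
    flat = style_frames.reshape(len(style_frames), -1)
    return flat.mean(axis=0)[:STYLE_DIM]


def style_aware_generator(image, motion, style):
    """Placeholder for stage 3: (image, motion, style) -> one frame per motion step."""
    gain = 1.0 + 0.01 * np.tanh(style.mean())   # style modulates the whole clip
    offsets = motion.mean(axis=(1, 2))           # one scalar per frame
    frames = image[None, ...] * gain + offsets[:, None, None, None]
    return np.clip(frames, 0.0, 1.0)             # keep pixel values in [0, 1]


def animate(image, audio, style_frames, n_frames, seed=0):
    """Wire the three placeholder stages together in the order the abstract gives."""
    rng = np.random.default_rng(seed)
    motion = audio_to_landmark_motion(audio, n_frames, rng)
    style = extract_style_pattern(style_frames)
    return style_aware_generator(image, motion, style)
```

Calling `animate` with a single `(H, W, 3)` image, a 1D waveform, and a `(K, H, W, 3)` stack of style frames returns an `(n_frames, H, W, 3)` array. The point of the sketch is the interface: the style input is separate from the identity image, which is what would allow a style extracted from one character to be applied to another image.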

Code Implementations: 1 repo
