CV · Mar 16, 2022

Efficient conditioned face animation using frontally-viewed embedding

Maxime Oquab, Daniel Haziza, Ludovic Schwartz, Tao Xu, Katayoun Zand, Rui Wang, Peirong Liu, Camille Couprie

arXiv:2203.08765v1 · 2.61 citations · h-index: 23

Originality: Highly original

AI Analysis

This work addresses a key bottleneck for ultra-low bandwidth video chat compression by enabling realistic facial animation in real-world conditions, such as on mobile devices like iPhone 8.

The paper tackles profile-view distortions in few-shot facial animation by introducing a multi-frame embedding called Frontalizer. Its dense models achieve a 22% improvement in perceptual quality and a 73% reduction in landmark error over the first order model baseline on a subset of DFDC videos containing head movements.

As the quality of few-shot facial animation from landmarks increases, new applications become possible, such as ultra-low-bandwidth video chat compression with a high degree of realism. However, important challenges remain before the experience is satisfactory in real-world conditions. In particular, current approaches fail to represent profile views without distortions while running in a low-compute regime. We focus on this key problem by introducing a multi-frame embedding dubbed Frontalizer to improve the rendering of profile views. In addition to this core improvement, we explore learning a latent code that conditions generation alongside landmarks to better convey facial expressions. Our dense models achieve a 22% improvement in perceptual quality and a 73% reduction in landmark error over the first order model baseline on a subset of DFDC videos containing head movements. Adapted to mobile architectures, our models outperform the previous state of the art (improving perceptual quality by more than 16% and reducing landmark error by more than 47% on two datasets) while running in real time on an iPhone 8 with very low bandwidth requirements.
