CV GR Apr 22, 2022

Leveraging Deepfakes to Close the Domain Gap between Real and Synthetic Images in Facial Capture Pipelines

Winnie Lin, Yilin Zhu, Demi Guo, Ron Fedkiw

arXiv:2204.10746v25.75 citations h-index: 62

Originality: Incremental advance

AI Analysis

This work addresses the challenge of facial capture for applications such as animation and VR by enabling robust tracking from low-quality video, though it is an incremental advance in that it builds on existing deepfake methods.

The paper tackles the problem of building and tracking 3D facial models from in-the-wild video data by using deepfake technology to bridge the synthetic-to-real domain gap, resulting in a pipeline that avoids the need for real-world ground truth or high-end capture setups.

We propose an end-to-end pipeline for both building and tracking 3D facial models from personalized in-the-wild (cellphone, webcam, YouTube clips, etc.) video data. First, we present a method for automatic data curation and retrieval based on a hierarchical clustering framework typical of collision detection algorithms in traditional computer graphics pipelines. Subsequently, we utilize synthetic turntables and leverage deepfake technology in order to build a synthetic multi-view stereo pipeline for appearance capture that is robust to imperfect synthetic geometry and image misalignment. The resulting model is fit with an animation rig, which is then used to track facial performances. Notably, our novel use of deepfake technology enables us to perform robust tracking of in-the-wild data using differentiable renderers despite a significant synthetic-to-real domain gap. Finally, we outline how we train a motion capture regressor, leveraging the aforementioned techniques to avoid the need for real-world ground truth data and/or a high-end calibrated camera capture setup.
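
The abstract does not spell out the hierarchical clustering framework used for data curation and retrieval. As a rough illustration only, the sketch below assumes each video frame is summarized by a small descriptor vector (a hypothetical choice, for example head pose angles) and builds a bounding-sphere hierarchy of the kind used in collision detection, then retrieves the stored frame closest to a query descriptor. The names `Node`, `nearest`, and `leaf_size` are illustrative and do not come from the paper.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Node:
    """One node of a bounding-sphere hierarchy over frame descriptors.

    Assumption: `points[i]` is a descriptor for frame i (hypothetical,
    not the paper's actual feature).
    """

    def __init__(self, points, indices, leaf_size=4):
        self.indices = indices
        dim = len(points[0])
        # Bounding sphere: centroid plus the largest distance to any member.
        self.center = [
            sum(points[i][d] for i in indices) / len(indices) for d in range(dim)
        ]
        self.radius = max(dist(points[i], self.center) for i in indices)
        self.children = []
        if len(indices) > leaf_size:
            # Split at the median along the axis with the greatest extent.
            extents = [
                max(points[i][d] for i in indices)
                - min(points[i][d] for i in indices)
                for d in range(dim)
            ]
            axis = extents.index(max(extents))
            ordered = sorted(indices, key=lambda i: points[i][axis])
            mid = len(ordered) // 2
            self.children = [
                Node(points, ordered[:mid], leaf_size),
                Node(points, ordered[mid:], leaf_size),
            ]


def nearest(node, points, query, best=(math.inf, -1)):
    """Return (distance, index) of the stored descriptor closest to `query`."""
    # Prune: no point inside the sphere can be closer than this lower bound.
    if dist(query, node.center) - node.radius >= best[0]:
        return best
    if not node.children:
        for i in node.indices:
            d = dist(query, points[i])
            if d < best[0]:
                best = (d, i)
        return best
    # Visit the closer child first so the bound tightens early.
    for child in sorted(node.children, key=lambda c: dist(query, c.center)):
        best = nearest(child, points, query, best)
    return best


# Usage: 200 random 3D descriptors standing in for curated video frames.
random.seed(0)
frames = [[random.random() for _ in range(3)] for _ in range(200)]
root = Node(frames, list(range(len(frames))))
distance, frame_index = nearest(root, frames, [0.5, 0.5, 0.5])
```

The sphere test is what makes this resemble a collision detection hierarchy: whole subtrees are skipped whenever their bounding volume cannot contain a closer match, so retrieval avoids comparing the query against every frame.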

