High-Quality Real Time Facial Capture Based on Single Camera
This addresses the problem of automating facial animation for digital doubles in narrative-driven media, though it appears incremental as it builds on existing pipelines like FACEGOOD.
The authors tackled real-time facial expression capture from video by training a convolutional neural network to produce high-quality blendshape weights, reducing labor in video game and film production.
We propose a real time deep learning framework for video-based facial expression capture. Our process uses a high-end facial capture pipeline based on FACEGOOD to capture facial expression. We train a convolutional neural network to produce high-quality continuous blendshape weight output from video training. Since this facial capture is fully automated, our system can drastically reduce the amount of labor involved in the development of modern narrative-driven video games or films involving realistic digital doubles of actors and potentially hours of animated dialogue per character. We demonstrate compelling animation inference in challenging areas such as eyes and lips.