Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs
The work enables efficient face-based AR effects on mobile devices, but its contribution is incremental because it builds on existing neural network approaches to facial geometry.
The paper addresses real-time 3D facial mesh reconstruction from monocular video for AR applications. It reports super-realtime inference at 100-1000+ FPS on mobile GPUs, depending on the device and model variant, with prediction quality comparable to the variance between manual annotations of the same image.
We present an end-to-end neural network-based model for inferring an approximate 3D mesh representation of a human face from single camera input for AR applications. The relatively dense mesh model of 468 vertices is well-suited for face-based AR effects. The proposed model demonstrates super-realtime inference speed on mobile GPUs (100-1000+ FPS, depending on the device and model variant) and a high prediction quality that is comparable to the variance in manual annotations of the same image.
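To put the reported figures in concrete terms, the following minimal sketch converts the stated throughput range into per-frame inference time and computes the size of the mesh output. The 468-vertex count and the 100-1000 FPS range come from the abstract. The helper names, the three-coordinates-per-vertex layout, and the 30 FPS camera rate are illustrative assumptions, not details taken from the paper.

```python
NUM_VERTICES = 468  # mesh size stated in the abstract


def frame_time_ms(fps: float) -> float:
    """Per-frame inference time in milliseconds at a given throughput."""
    return 1000.0 / fps


def realtime_headroom(fps: float, camera_fps: float = 30.0) -> float:
    """Number of inferences that fit in one camera frame interval.

    camera_fps = 30 is an assumed capture rate, not a figure from the paper.
    """
    return fps / camera_fps


def mesh_output_size(num_vertices: int = NUM_VERTICES,
                     coords_per_vertex: int = 3) -> int:
    """Scalar values per predicted mesh, assuming (x, y, z) per vertex."""
    return num_vertices * coords_per_vertex


if __name__ == "__main__":
    for fps in (100, 1000):
        print(f"{fps} FPS -> {frame_time_ms(fps)} ms per frame, "
              f"{realtime_headroom(fps):.1f}x a 30 FPS camera")
    print(f"values per mesh: {mesh_output_size()}")
```

Under these assumptions, the reported range corresponds to roughly 1 to 10 ms of inference per frame, which leaves most of a 30 FPS frame budget free for rendering the AR effect itself.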