CV · Sep 11, 2023

Blendshapes GHUM: Real-time Monocular Facial Blendshape Prediction

arXiv:2309.05782v1 · 5 citations · h-index: 65
Originality: Synthesis-oriented
AI Analysis

The work enables on-device facial motion capture for applications like virtual avatars, but it is incremental because it builds on existing landmark-based methods.

The paper tackles real-time facial blendshape prediction from monocular images, predicting 52 coefficients at 30+ FPS on mobile devices.

We present Blendshapes GHUM, an on-device ML pipeline that predicts 52 facial blendshape coefficients at 30+ FPS on modern mobile phones, from a single monocular RGB image and enables facial motion capture applications like virtual avatars. Our main contributions are: i) an annotation-free offline method for obtaining blendshape coefficients from real-world human scans, ii) a lightweight real-time model that predicts blendshape coefficients based on facial landmarks.
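
For readers unfamiliar with blendshape coefficients, the following is a minimal sketch of how such coefficients are typically consumed by an avatar renderer, assuming the standard linear blendshape model (deformed mesh = neutral mesh + weighted sum of per-shape vertex offsets). It is not the paper's pipeline: the function name `apply_blendshapes`, the toy two-vertex mesh, the shape names, and the clamping of weights to [0, 1] are all illustrative assumptions.

```python
def apply_blendshapes(neutral, deltas, weights):
    """Deform a mesh with the linear blendshape model.

    neutral: list of (x, y, z) vertices of the neutral face.
    deltas:  one list of (dx, dy, dz) offsets per blendshape,
             each the same length as `neutral`.
    weights: one coefficient per blendshape (52 in the paper's setting).
    """
    if len(deltas) != len(weights):
        raise ValueError("need exactly one weight per blendshape")
    # Assumption: coefficients are meant to lie in [0, 1], so clamp them.
    clamped = [min(1.0, max(0.0, w)) for w in weights]
    deformed = []
    for v_idx, vertex in enumerate(neutral):
        deformed.append(tuple(
            vertex[axis] + sum(w * d[v_idx][axis] for w, d in zip(clamped, deltas))
            for axis in range(3)
        ))
    return deformed


# Toy example: two vertices and two hypothetical blendshapes.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
jaw_open = [(0.0, -1.0, 0.0), (0.0, 0.0, 0.0)]   # moves vertex 0 down
smile = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0)]       # moves vertex 1 up and out
result = apply_blendshapes(neutral, [jaw_open, smile], [0.5, 1.0])
print(result)  # [(0.0, -0.5, 0.0), (1.5, 0.5, 0.0)]
```

In this framing, a predictor like the one the abstract describes would supply the `weights` vector for each frame, and the renderer would apply it to its own neutral mesh and offsets.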

