CV | Feb 3, 2024

Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units

arXiv:2402.03944v4 | h-index: 10 | AAAI
Originality: Highly original
AI Analysis

The paper's IMU-based approach offers a robust, privacy-enhanced solution for facial motion capture where visual methods fail, such as in VR/AR applications or under adverse conditions.

The paper tackles facial motion capture without visual signals by using miniaturized Inertial Measurement Units (IMUs) placed on the face, achieving reliable performance in challenging conditions like occlusions and low-light environments.

We present Capturing the Unseen (CAPUS), a novel facial motion capture (MoCap) technique that operates without visual signals. CAPUS leverages miniaturized Inertial Measurement Units (IMUs) as a new sensing modality for facial motion capture. While IMUs have become essential in full-body MoCap for their portability and independence from environmental conditions, their application in facial MoCap remains underexplored. We address this by customizing micro-IMUs, small enough to be placed on the face, and strategically positioning them in alignment with key facial muscles to capture expression dynamics. CAPUS introduces the first facial IMU dataset, encompassing both IMU and visual signals from participants engaged in diverse activities such as multilingual speech, facial expressions, and emotionally intoned auditions. We train a Transformer Diffusion-based neural network to infer Blendshape parameters directly from IMU data. Our experimental results demonstrate that CAPUS reliably captures facial motion in conditions where visual-based methods struggle, including facial occlusions, rapid movements, and low-light environments. Additionally, by eliminating the need for visual inputs, CAPUS offers enhanced privacy protection, making it a robust solution for vision-free facial MoCap.
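For readers unfamiliar with Blendshape parameters, the following minimal sketch illustrates the standard linear blendshape model that such parameters typically drive: a deformed face mesh is the neutral mesh plus a weighted sum of per-shape vertex offsets. It is not the CAPUS network, which the abstract does not specify in detail. The function name `apply_blendshapes`, the two-vertex mesh, and the shapes `jaw_open` and `smile` are hypothetical toy values chosen only for illustration.

```python
from typing import List, Sequence

Vertex = List[float]  # one vertex as [x, y, z]


def apply_blendshapes(
    neutral: Sequence[Vertex],
    deltas: Sequence[Sequence[Vertex]],
    weights: Sequence[float],
) -> List[Vertex]:
    """Linear blendshape model: neutral + sum_i weights[i] * deltas[i].

    `deltas[i]` holds the per-vertex offsets of blendshape i relative to
    the neutral mesh; `weights[i]` is the predicted parameter for it.
    """
    if len(deltas) != len(weights):
        raise ValueError("need exactly one weight per blendshape")

    # Start from a copy of the neutral mesh so the input is not mutated.
    result = [list(v) for v in neutral]
    for delta, w in zip(deltas, weights):
        if len(delta) != len(neutral):
            raise ValueError("blendshape and neutral mesh differ in size")
        for vi, offset in enumerate(delta):
            for axis in range(3):
                result[vi][axis] += w * offset[axis]
    return result


# Toy data (hypothetical): a two-vertex "mesh" and two blendshapes.
neutral_mesh = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
jaw_open = [[0.0, -0.5, 0.0], [0.0, -0.5, 0.0]]
smile = [[0.0, 0.0, 0.0], [0.25, 0.25, 0.0]]

# Weights like these are what a model inferring Blendshape parameters would output.
deformed = apply_blendshapes(neutral_mesh, [jaw_open, smile], [0.5, 1.0])
print(deformed)
```

Under this model, a capture system only needs to predict a short weight vector per frame; the mesh geometry itself comes from the pre-built blendshape rig.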
