CV · Aug 17, 2025

Geometry-Aware Video Inpainting for Joint Headset Occlusion Removal and Face Reconstruction in Social XR

arXiv:2508.12336v1 · h-index: 5 · J. Electronic Imaging
Originality: Incremental advance
AI Analysis

The work addresses a need in social XR applications such as teleconferencing, where facial expressions and eye gaze must remain visible despite the headset. It is an incremental improvement that combines existing methods.

This study tackles the problem of head-mounted displays (HMDs) obscuring facial features in social XR applications. It develops a geometry-aware framework that jointly removes HMD occlusions and reconstructs complete 3D facial geometry from single-view RGB frames. The reported outputs are photorealistic, and quality degrades only slightly under sparse landmark configurations.

Head-mounted displays (HMDs) are essential for experiencing extended reality (XR) environments and observing virtual content. However, they obscure the upper part of the user's face, complicating external video recording and significantly impacting social XR applications such as teleconferencing, where facial expressions and eye gaze details are crucial for creating an immersive experience. This study introduces a geometry-aware learning-based framework to jointly remove HMD occlusions and reconstruct complete 3D facial geometry from RGB frames captured from a single viewpoint. The method integrates a GAN-based video inpainting network, guided by dense facial landmarks and a single occlusion-free reference frame, to restore missing facial regions while preserving identity. Subsequently, a SynergyNet-based module regresses 3D Morphable Model (3DMM) parameters from the inpainted frames, enabling accurate 3D face reconstruction. Dense landmark optimization is incorporated throughout the pipeline to improve both the inpainting quality and the fidelity of the recovered geometry. Experimental results demonstrate that the proposed framework can successfully remove HMDs from RGB facial videos while maintaining facial identity and realism, producing photorealistic 3D face geometry outputs. Ablation studies further show that the framework remains robust across different landmark densities, with only minor quality degradation under sparse landmark configurations.
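
The abstract states that a SynergyNet-based module regresses 3DMM parameters from the inpainted frames. As a minimal sketch of how such parameters are commonly turned into geometry, the example below assumes the standard linear 3DMM formulation: a mean shape plus weighted identity and expression basis vectors. The function name `decode_3dmm`, the tiny dimensions, and the random toy bases are hypothetical and are not taken from the paper. Plain Python is used only to keep the sketch dependency-free.

```python
import random


def decode_3dmm(mean_shape, id_basis, exp_basis, alpha_id, alpha_exp):
    """Return vertices as (x, y, z) tuples from linear 3DMM coefficients.

    mean_shape: flat list of 3*N floats (x0, y0, z0, x1, ...).
    id_basis / exp_basis: lists of basis vectors, each a flat list of 3*N floats.
    alpha_id / alpha_exp: one coefficient per basis vector.
    """
    if len(id_basis) != len(alpha_id) or len(exp_basis) != len(alpha_exp):
        raise ValueError("need exactly one coefficient per basis vector")

    # Start from the mean face, then add each weighted basis vector.
    flat = list(mean_shape)
    for basis, coeffs in ((id_basis, alpha_id), (exp_basis, alpha_exp)):
        for vector, weight in zip(basis, coeffs):
            for i, value in enumerate(vector):
                flat[i] += weight * value

    # Group the flat coordinate list into per-vertex (x, y, z) tuples.
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Toy model (assumption): 4 vertices, 3 identity and 2 expression components.
# A real face model has thousands of vertices and learned bases.
rng = random.Random(0)
N_VERTICES, N_ID, N_EXP = 4, 3, 2


def random_vector():
    return [rng.uniform(-1.0, 1.0) for _ in range(3 * N_VERTICES)]


mean_shape = random_vector()
id_basis = [random_vector() for _ in range(N_ID)]
exp_basis = [random_vector() for _ in range(N_EXP)]

# All-zero coefficients reproduce the mean face.
neutral = decode_3dmm(mean_shape, id_basis, exp_basis,
                      [0.0] * N_ID, [0.0] * N_EXP)

# Activating only the first expression component deforms the mean face.
expressive = decode_3dmm(mean_shape, id_basis, exp_basis,
                         [0.0] * N_ID, [1.0, 0.0])
```

In a pipeline like the one described, the coefficient lists would come from the regression network rather than being set by hand, and a pose transform would typically be applied to the decoded vertices afterwards.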

