CV | Aug 1, 2023

High-Fidelity Eye Animatable Neural Radiance Fields for Human Face

arXiv:2308.00773v3 | 7 citations | h-index: 32
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific computer vision problem with applications such as gaze estimation. It is incremental: it builds on existing NeRF and FLAME methods and adds eye movement modeling.

The paper tackles eyeball rotation modeling, an often-overlooked aspect of neural radiance fields (NeRF) for human face rendering. It demonstrates that the proposed Dynamic Eye-aware NeRF (DeNeRF) generates high-fidelity images with accurate eyeball rotation and periocular deformation, and that these rendered images improve gaze estimation performance on the ETH-XGaze dataset.

Face rendering using neural radiance fields (NeRF) is a rapidly developing research area in computer vision. While recent methods primarily focus on controlling facial attributes such as identity and expression, they often overlook the crucial aspect of modeling eyeball rotation, which holds importance for various downstream tasks. In this paper, we aim to learn a face NeRF model that is sensitive to eye movements from multi-view images. We address two key challenges in eye-aware face NeRF learning: how to effectively capture eyeball rotation for training and how to construct a manifold for representing eyeball rotation. To accomplish this, we first fit FLAME, a well-established parametric face model, to the multi-view images considering multi-view consistency. Subsequently, we introduce a new Dynamic Eye-aware NeRF (DeNeRF). DeNeRF transforms 3D points from different views into a canonical space to learn a unified face NeRF model. We design an eye deformation field for the transformation, including rigid transformation, e.g., eyeball rotation, and non-rigid transformation. Through experiments conducted on the ETH-XGaze dataset, we demonstrate that our model is capable of generating high-fidelity images with accurate eyeball rotation and non-rigid periocular deformation, even under novel viewing angles. Furthermore, we show that utilizing the rendered images can effectively enhance gaze estimation performance.
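The rigid part of the eye deformation field described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the pitch/yaw rotation convention, and the hard spherical mask around the eyeball are all assumptions made for illustration, and the non-rigid periocular deformation (which the paper also includes in the field) is left out entirely. The idea shown is only that points sampled inside a rotated eyeball are mapped back to a shared canonical space by undoing the rotation about the eyeball center.

```python
import numpy as np


def rotation_from_pitch_yaw(pitch, yaw):
    """Build a 3x3 rotation: pitch about x, then yaw about y.

    The axis order and sign convention are assumptions for this sketch.
    """
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    return rot_y @ rot_x


def to_canonical(points, eye_center, eye_radius, rotation):
    """Map observed 3D points to canonical space (rigid eyeball part only).

    Points inside the (hypothetical) eyeball sphere have the eyeball
    rotation undone about the eyeball center; all other points are
    returned unchanged.
    """
    points = np.asarray(points, dtype=float)
    eye_center = np.asarray(eye_center, dtype=float)
    offsets = points - eye_center
    inside = np.linalg.norm(offsets, axis=1) <= eye_radius
    canonical = points.copy()
    # For row vectors, applying R^T to each offset is `offset @ R`.
    canonical[inside] = offsets[inside] @ rotation + eye_center
    return canonical


# Usage: rotate a canonical iris point forward, then recover it.
center = np.array([0.03, 0.04, 0.05])          # hypothetical eyeball center (m)
R = rotation_from_pitch_yaw(np.deg2rad(10.0), np.deg2rad(-20.0))
iris_canonical = center + np.array([0.0, 0.0, 0.010])
iris_observed = center + R @ (iris_canonical - center)
skin_point = center + np.array([0.0, 0.03, 0.0])  # outside the eyeball

recovered = to_canonical(
    np.stack([iris_observed, skin_point]), center, eye_radius=0.012, rotation=R
)
```

A hard sphere mask is the simplest choice for a sketch; a learned model would more plausibly blend the rigid and non-rigid components smoothly near the eyelids, but that is beyond what the abstract specifies.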

