CVGR | Oct 24, 2024

Real-time 3D-aware Portrait Video Relighting

arXiv:2410.18355v1 | 26 citations | h-index: 17 | CVPR
Originality: Highly original
AI Analysis

The method enables real-time, interactive video relighting for applications such as video conferencing, a significant advance over prior methods that were either slower or unable to adjust the viewpoint.

The paper tackles the problem of synthesizing realistic videos of talking faces under custom lighting and viewing angles in real time. Its method runs at 32.98 fps on consumer hardware and achieves state-of-the-art results.
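To put the reported frame rate in perspective, the short sketch below converts 32.98 fps into an average per-frame time budget and checks it against a few playback rates. The helper names and the 24/30 fps targets are illustrative assumptions, not figures from the paper.

```python
def frame_time_ms(fps: float) -> float:
    """Return the average time available per frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def meets_target(measured_fps: float, target_fps: float) -> bool:
    """True if the measured throughput can sustain the target frame rate."""
    return measured_fps >= target_fps


REPORTED_FPS = 32.98  # throughput stated in the paper summary

if __name__ == "__main__":
    print(f"{frame_time_ms(REPORTED_FPS):.2f} ms per frame")
    for target in (24.0, 30.0):  # assumed targets, for illustration only
        print(target, meets_target(REPORTED_FPS, target))
```

At the reported rate, each frame has roughly 30 ms for encoding, relighting, and rendering combined.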

Synthesizing realistic videos of talking faces under custom lighting conditions and viewing angles benefits various downstream applications like video conferencing. However, most existing relighting methods are either time-consuming or unable to adjust the viewpoints. In this paper, we present the first real-time 3D-aware method for relighting in-the-wild videos of talking faces based on Neural Radiance Fields (NeRF). Given an input portrait video, our method can synthesize talking faces under both novel views and novel lighting conditions with a photo-realistic and disentangled 3D representation. Specifically, we infer an albedo tri-plane, as well as a shading tri-plane based on a desired lighting condition for each video frame with fast dual-encoders. We also leverage a temporal consistency network to ensure smooth transitions and reduce flickering artifacts. Our method runs at 32.98 fps on consumer-level hardware and achieves state-of-the-art results in terms of reconstruction quality, lighting error, lighting instability, temporal consistency and inference speed. We demonstrate the effectiveness and interactivity of our method on various portrait videos with diverse lighting and viewing conditions.

Code Implementations: 1 repo
