CV | Mar 17, 2025

R3-Avatar: Record and Retrieve Temporal Codebook for Reconstructing Photorealistic Human Avatars

arXiv:2503.12751v1 | 6 citations | h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of creating realistic, animatable avatars for applications in virtual reality and gaming. It represents a strong incremental advance over existing video-based methods.

The paper tackles the problem of reconstructing photorealistic human avatars that are both animatable and high-fidelity. It introduces R3-Avatar, which uses a temporal codebook, and reports improved visual quality in extreme scenarios such as limited training poses and complex clothing.

We present R3-Avatar, incorporating a temporal codebook, to overcome the inability of human avatars to be both animatable and of high-fidelity rendering quality. Existing video-based reconstruction of 3D human avatars either focuses solely on rendering, lacking animation support, or learns a pose-appearance mapping for animating, which degrades under limited training poses or complex clothing. In this paper, we adopt a "record-retrieve-reconstruct" strategy that ensures high-quality rendering from novel views while mitigating degradation in novel poses. Specifically, disambiguating timestamps record temporal appearance variations in a codebook, ensuring high-fidelity novel-view rendering, while novel poses retrieve corresponding timestamps by matching the most similar training poses for augmented appearance. Our R3-Avatar outperforms cutting-edge video-based human avatar reconstruction, particularly in overcoming visual quality degradation in extreme scenarios with limited training human poses and complex clothing.
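The "record" and "retrieve" steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `TemporalCodebook` class, its method names, the flat pose vectors, and the use of Euclidean distance as the pose-similarity measure are all assumptions made for illustration, since the abstract does not specify how poses are encoded or compared.

```python
import math
from dataclasses import dataclass, field


@dataclass
class TemporalCodebook:
    """Toy codebook (hypothetical): one appearance code per training timestamp."""
    poses: list = field(default_factory=list)  # training pose vector per timestamp
    codes: list = field(default_factory=list)  # appearance code per timestamp

    def record(self, pose, code):
        """Record step: store a training pose with its appearance code.

        Returns the timestamp index assigned to this entry.
        """
        self.poses.append(list(pose))
        self.codes.append(list(code))
        return len(self.poses) - 1

    def retrieve(self, novel_pose):
        """Retrieve step: find the timestamp whose training pose is closest.

        Euclidean distance is an assumed similarity measure, not the paper's.
        """
        if not self.poses:
            raise ValueError("codebook is empty")
        t = min(
            range(len(self.poses)),
            key=lambda i: math.dist(self.poses[i], novel_pose),
        )
        return t, self.codes[t]


# Usage: record two training frames, then query with an unseen pose.
codebook = TemporalCodebook()
codebook.record([0.0, 0.0], [1.0, 0.0])
codebook.record([1.0, 1.0], [0.0, 1.0])
timestamp, code = codebook.retrieve([0.9, 1.2])
```

In this sketch the novel pose is closest to the second training pose, so the retrieved timestamp is 1 and its appearance code would be handed to the renderer. A real system would presumably operate on learned pose features and feed the retrieved timestamp into the reconstruction stage.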
