CV AI GR

Aug 23, 2023

Pose Modulated Avatars from Video

Chunjin Song, Bastian Wandt, Helge Rhodin

arXiv:2308.11951v33.94 citations

h-index: 30

Originality: Incremental advance

AI Analysis

The paper addresses noisy artifacts and blurred details in avatar reconstruction for applications such as animation and VR. It is an incremental improvement over existing NeRF-based methods.

The paper tackles the challenge of modeling how cloth and skin deform with skeleton pose when building human avatars from video. It develops a two-branch neural network that adapts in the frequency domain and, according to the authors' experiments, outperforms state-of-the-art methods in preserving details and in generalization.

It is now possible to reconstruct dynamic human motion and shape from a sparse set of cameras using Neural Radiance Fields (NeRF) driven by an underlying skeleton. However, a challenge remains to model the deformation of cloth and skin in relation to skeleton pose. Unlike existing avatar models that are learned implicitly or rely on a proxy surface, our approach is motivated by the observation that different poses necessitate unique frequency assignments. Neglecting this distinction yields noisy artifacts in smooth areas or blurs fine-grained texture and shape details in sharp regions. We develop a two-branch neural network that is adaptive and explicit in the frequency domain. The first branch is a graph neural network that models correlations among body parts locally, taking skeleton pose as input. The second branch combines these correlation features to a set of global frequencies and then modulates the feature encoding. Our experiments demonstrate that our network outperforms state-of-the-art methods in terms of preserving details and generalization capabilities.
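The two-branch idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the four-joint chain skeleton, the neighbor-averaging message passing, the hand-picked sigmoid that maps pooled features to per-frequency gains, and all names (`EDGES`, `BASE_FREQS`, `graph_branch`, `frequency_branch`, `pose_modulated_encoding`) are assumptions made for illustration. In a real model, both branches would be learned layers.

```python
import math

# Hypothetical toy skeleton: 4 joints in a chain 0-1-2-3 (not the paper's skeleton).
EDGES = [(0, 1), (1, 2), (2, 3)]
# Hypothetical frequency bands for a sinusoidal encoding.
BASE_FREQS = [1.0, 2.0, 4.0]


def graph_branch(joint_angles, edges, rounds=2):
    """Branch 1 (sketch): local message passing over the skeleton graph.

    Each joint repeatedly averages its own feature with its neighbors'
    features. Plain averaging stands in for a learned graph neural network.
    """
    n = len(joint_angles)
    adj = [[j] for j in range(n)]  # each joint also sees itself
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    feats = list(joint_angles)
    for _ in range(rounds):
        feats = [sum(feats[k] for k in adj[j]) / len(adj[j]) for j in range(n)]
    return feats


def frequency_branch(feats, base_freqs):
    """Branch 2 (sketch): pool joint features into global per-frequency gains.

    The sigmoid below is a hand-picked stand-in for learned layers: a larger
    pooled feature opens up the higher-frequency bands.
    """
    pooled = sum(feats) / len(feats)
    return [1.0 / (1.0 + math.exp(-(pooled - math.log2(f)))) for f in base_freqs]


def pose_modulated_encoding(x, joint_angles):
    """Sinusoidal encoding of a scalar x, with each band scaled by a pose-dependent gain."""
    gains = frequency_branch(graph_branch(joint_angles, EDGES), BASE_FREQS)
    out = []
    for f, g in zip(BASE_FREQS, gains):
        out.append(g * math.sin(f * x))
        out.append(g * math.cos(f * x))
    return out


if __name__ == "__main__":
    # Same query coordinate, two poses: the encoding differs because the gains differ.
    print(pose_modulated_encoding(0.3, [0.0, 0.0, 0.0, 0.0]))
    print(pose_modulated_encoding(0.3, [2.0, 2.0, 2.0, 2.0]))
```

In this sketch the same query coordinate receives a different encoding for each pose, which is the sense in which the frequency content is pose-modulated: a pose that produces small gains on the high bands yields a smoother encoding, while a pose that produces large gains lets fine detail through.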
