CV · Jul 24, 2025

Learning Efficient and Generalizable Human Representation with Human Gaussian Model

arXiv:2507.18758v1 · 2 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific challenge in 3D human avatar modeling, offering an incremental improvement over existing feed-forward methods.

The paper tackles the problem of modeling animatable human avatars from videos by proposing a Human Gaussian Graph to capture relations between Gaussians across frames, resulting in improved efficiency and generalization for novel view synthesis and pose animation.

Modeling animatable human avatars from videos is a long-standing and challenging problem. While conventional methods require per-instance optimization, recent feed-forward methods generate 3D Gaussians with a learnable network. However, these methods predict Gaussians for each frame independently, without fully capturing the relations between Gaussians from different timestamps. To address this, we propose the Human Gaussian Graph, which models the connection between the predicted Gaussians and the human SMPL mesh so that information from all frames can be leveraged to recover an animatable human representation. Specifically, the Human Gaussian Graph has two layers: Gaussians are the first-layer nodes and mesh vertices are the second-layer nodes. Based on this structure, we further propose an intra-node operation that aggregates the Gaussians connected to one mesh vertex, and an inter-node operation that supports message passing among neighboring mesh nodes. Experimental results on novel view synthesis and novel pose animation demonstrate the efficiency and generalization of our method.
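The two-layer structure described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation: the abstract does not specify the operations, so mean pooling stands in for the intra-node aggregation and a simple neighbor average stands in for the inter-node message passing, both of which are presumably learned in the actual method. The function names, the `alpha` blending weight, and the toy three-vertex mesh are hypothetical and chosen only for illustration.

```python
import numpy as np


def intra_node_aggregate(gauss_feats, gauss_to_vertex, num_vertices):
    """Mean-pool the features of all Gaussians attached to each mesh vertex.

    Assumption: plain averaging replaces whatever learned aggregation
    the paper uses. Vertices with no attached Gaussian get a zero vector.
    """
    dim = gauss_feats.shape[1]
    sums = np.zeros((num_vertices, dim))
    counts = np.zeros(num_vertices)
    # np.add.at accumulates correctly when an index appears more than once.
    np.add.at(sums, gauss_to_vertex, gauss_feats)
    np.add.at(counts, gauss_to_vertex, 1.0)
    return sums / np.maximum(counts, 1.0)[:, None]


def inter_node_message_pass(vertex_feats, edges, alpha=0.5):
    """One round of message passing between neighboring mesh vertices.

    Assumption: each vertex is blended with the mean of its neighbors
    using a fixed weight alpha (hypothetical, not from the paper).
    """
    num_vertices = vertex_feats.shape[0]
    # Treat every mesh edge as undirected.
    src = np.concatenate([edges[:, 0], edges[:, 1]])
    dst = np.concatenate([edges[:, 1], edges[:, 0]])
    sums = np.zeros_like(vertex_feats)
    degree = np.zeros(num_vertices)
    np.add.at(sums, dst, vertex_feats[src])
    np.add.at(degree, dst, 1.0)
    neighbor_mean = sums / np.maximum(degree, 1.0)[:, None]
    blended = (1.0 - alpha) * vertex_feats + alpha * neighbor_mean
    # Isolated vertices keep their own features unchanged.
    return np.where((degree > 0)[:, None], blended, vertex_feats)


# Toy example: 5 Gaussians (possibly from different frames) attached to a
# 3-vertex chain 0 - 1 - 2. All values are made up for illustration.
gauss_feats = np.array([[1.0, 0.0], [3.0, 0.0], [2.0, 2.0], [0.0, 4.0], [0.0, 8.0]])
gauss_to_vertex = np.array([0, 0, 1, 2, 2])
edges = np.array([[0, 1], [1, 2]])

pooled = intra_node_aggregate(gauss_feats, gauss_to_vertex, num_vertices=3)
smoothed = inter_node_message_pass(pooled, edges, alpha=0.5)
print(pooled)
print(smoothed)
```

In this sketch, Gaussians predicted from any frame contribute to the same vertex feature as long as they map to the same mesh vertex, which is one way information from all frames could be pooled into a single representation anchored on the mesh.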

