CV GR Apr 1

Autoregressive Appearance Prediction for 3D Gaussian Avatars

arXiv:2604.00928
AI Analysis

This paper addresses the challenge of producing photorealistic, stable avatars for immersive experiences, though the contribution appears incremental because it builds on existing 3D Gaussian Splatting methods.

The paper tackles unstable appearance changes in 3D human avatars. It proposes a model whose autoregressive predictor infers appearance latents at driving time, which improves temporal smoothness and stability in avatar driving.

A photorealistic and immersive human avatar experience demands capturing fine, person-specific details such as cloth and hair dynamics, subtle facial expressions, and characteristic motion patterns. Achieving this requires large, high-quality datasets, which often introduce ambiguities and spurious correlations when very similar poses correspond to different appearances. Models that fit these details during training can overfit and produce unstable, abrupt appearance changes for novel poses. We propose a 3D Gaussian Splatting avatar model with a spatial MLP backbone that is conditioned on both pose and an appearance latent. The latent is learned during training by an encoder, yielding a compact representation that improves reconstruction quality and helps disambiguate pose-driven renderings. At driving time, our predictor autoregressively infers the latent, producing temporally smooth appearance evolution and improved stability. Overall, our method delivers a robust and practical path to high-fidelity, stable avatar driving.
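
The driving-time idea in the abstract, where each appearance latent is inferred from the previous latent and the current pose, can be illustrated with a toy rollout. This is only a sketch under stated assumptions and not the authors' implementation: `toy_predictor`, `per_frame_latents`, the `tanh` pose mapping, the `step` size, and the choice to give the latent the same dimension as the pose are all hypothetical stand-ins for learned components that the abstract does not specify.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def toy_predictor(prev_latent: Sequence[float], pose: Sequence[float],
                  step: float = 0.2) -> Vector:
    """Hypothetical stand-in for a learned predictor (an assumption).

    Moves the previous latent a fraction `step` toward a pose-dependent
    target, so the output depends on both the pose and the latent history.
    """
    target = [math.tanh(p) for p in pose]
    return [z + step * (t - z) for z, t in zip(prev_latent, target)]


def rollout(predictor: Callable[[Sequence[float], Sequence[float]], Vector],
            initial_latent: Sequence[float],
            poses: Sequence[Sequence[float]]) -> List[Vector]:
    """Autoregressive rollout: each latent is fed back in at the next frame."""
    latents: List[Vector] = []
    z = list(initial_latent)
    for pose in poses:
        z = predictor(z, pose)
        latents.append(z)
    return latents


def per_frame_latents(poses: Sequence[Sequence[float]]) -> List[Vector]:
    """Baseline for comparison: latent computed from the pose alone, no history."""
    return [[math.tanh(p) for p in pose] for pose in poses]


def max_frame_jump(latents: Sequence[Sequence[float]]) -> float:
    """Largest per-component change between consecutive frames."""
    return max(
        max(abs(a - b) for a, b in zip(prev, cur))
        for prev, cur in zip(latents, latents[1:])
    )


if __name__ == "__main__":
    # Abruptly alternating poses, chosen to stress temporal stability.
    poses = [[0.0, 0.0], [2.0, -2.0], [0.0, 0.0], [2.0, -2.0]]
    print("autoregressive:", max_frame_jump(rollout(toy_predictor, [0.0, 0.0], poses)))
    print("per-frame:     ", max_frame_jump(per_frame_latents(poses)))
```

In this toy setting the autoregressive rollout changes less between frames than the per-frame baseline, because each step can only move part of the way from the previous latent. A learned predictor would replace the fixed update rule, and the resulting latent would condition the avatar model alongside the pose.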

