CV · Aug 19, 2024

3D-Consistent Human Avatars with Sparse Inputs via Gaussian Splatting and Contrastive Learning

arXiv:2408.09663v3 · 8 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of creating high-quality human avatars for applications such as virtual reality and animation when only limited input data is available. It is a strong incremental improvement over prior skeleton-driven approaches.

The paper tackles the problem of generating 3D-consistent human avatars from sparse inputs, a setting in which existing methods lose fine detail. The proposed CHASE framework achieves dense-input-level performance using only sparse inputs, surpassing state-of-the-art methods on the ZJU-MoCap and H36M datasets.

Existing approaches for human avatar generation, both NeRF-based and 3D Gaussian Splatting (3DGS)-based, struggle to maintain 3D consistency and exhibit degraded detail reconstruction, particularly when trained with sparse inputs. To address this challenge, we propose CHASE, a novel framework that achieves dense-input-level performance using only sparse inputs through two key innovations: cross-pose intrinsic 3D consistency supervision and 3D geometry contrastive learning. Building upon prior skeleton-driven approaches that combine rigid deformation with non-rigid cloth dynamics, we first establish baseline avatars with fundamental 3D consistency. To enhance 3D consistency under sparse inputs, we introduce a Dynamic Avatar Adjustment (DAA) module, which refines deformed Gaussians by leveraging similar poses from the training set. By minimizing the rendering discrepancy between adjusted Gaussians and reference poses, DAA provides additional supervision for avatar reconstruction. We further maintain global 3D consistency through a novel geometry-aware contrastive learning strategy. While designed for sparse inputs, CHASE surpasses state-of-the-art methods across both full and sparse settings on the ZJU-MoCap and H36M datasets, demonstrating that our enhanced 3D consistency leads to superior rendering quality.
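The abstract names two ingredients without giving formulas: retrieving a similar pose from the training set, and a contrastive objective on geometry features. The sketch below illustrates one generic way each could look. It is not the CHASE implementation. The function names, the squared-L2 distance between pose vectors, and the InfoNCE form of the loss are all assumptions made for illustration.

```python
import math

def nearest_pose_index(query_pose, train_poses):
    """Index of the training pose closest to query_pose.

    Assumption: poses are flat parameter vectors compared by squared L2
    distance; the paper may use a different similarity measure.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(train_poses)),
               key=lambda i: sq_dist(query_pose, train_poses[i]))

def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor, one positive, K negatives.

    Assumption: this standard form stands in for the paper's
    geometry-aware contrastive loss, whose details are not given here.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

# Toy usage with hypothetical 2-D vectors.
train_poses = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
ref = nearest_pose_index([1.9, 2.1], train_poses)
loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

In this toy setup the loss is small when the anchor and positive features align and grows when a negative is closer to the anchor than the positive, which is the behaviour a contrastive term is meant to penalize.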

