CV · Jun 18, 2024

HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors

arXiv:2406.12459v2 · 66 citations
AI Analysis

The paper addresses the need for efficient human reconstruction without densely captured images or per-instance optimization, though it appears incremental because it builds on existing Gaussian Splatting techniques.

The paper tackles the problem of high-fidelity human reconstruction from a single image by presenting HumanSplat, which predicts 3D Gaussian Splatting properties in a generalizable manner, achieving photorealistic novel-view synthesis and surpassing state-of-the-art methods.

Despite recent advancements in high-fidelity human reconstruction techniques, the requirements for densely captured images or time-consuming per-instance optimization significantly hinder their applications in broader scenarios. To tackle these issues, we present HumanSplat which predicts the 3D Gaussian Splatting properties of any human from a single input image in a generalizable manner. In particular, HumanSplat comprises a 2D multi-view diffusion model and a latent reconstruction transformer with human structure priors that adeptly integrate geometric priors and semantic features within a unified framework. A hierarchical loss that incorporates human semantic information is further designed to achieve high-fidelity texture modeling and better constrain the estimated multiple views. Comprehensive experiments on standard benchmarks and in-the-wild images demonstrate that HumanSplat surpasses existing state-of-the-art methods in achieving photorealistic novel-view synthesis.

Code Implementations: 1 repo
