HumanCrafter: Synergizing Generalizable Human Reconstruction and Semantic 3D Segmentation
This work addresses the need for more functional 3D human models in computer vision applications, though it appears incremental, since it integrates existing priors into a multi-task framework.
The paper tackles the limited utility of 3D human reconstructions for downstream tasks such as segmentation by proposing HumanCrafter, a unified framework that jointly models appearance and human-part semantics from a single image. The authors report that it surpasses state-of-the-art methods in both 3D human-part segmentation and 3D human reconstruction.
Recent advances in generative models have achieved high fidelity in 3D human reconstruction, yet their utility for specific tasks (e.g., 3D human segmentation) remains constrained. We propose HumanCrafter, a unified framework that enables the joint modeling of appearance and human-part semantics from a single image in a feed-forward manner. Specifically, we integrate human geometric priors in the reconstruction stage and self-supervised semantic priors in the segmentation stage. To address the scarcity of labeled 3D human datasets, we further develop an interactive annotation procedure for generating high-quality data-label pairs. Our pixel-aligned aggregation enables cross-task synergy, while the multi-task objective simultaneously optimizes texture modeling fidelity and semantic consistency. Extensive experiments demonstrate that HumanCrafter surpasses existing state-of-the-art methods in both 3D human-part segmentation and 3D human reconstruction from a single image.
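The abstract does not spell out the form of the multi-task objective, so the following is only a minimal sketch of how an appearance term and a semantic term might be combined into one loss. The choice of mean squared error for appearance, softmax cross-entropy for part labels, the function names, and the weights `w_rgb` and `w_sem` are all assumptions made for illustration, not the authors' formulation.

```python
import math

def mse(pred, target):
    """Mean squared error over two flat lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one pixel; `logits` holds one score per body part."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def multi_task_loss(pred_rgb, target_rgb, sem_logits, sem_labels,
                    w_rgb=1.0, w_sem=1.0):
    """Weighted sum of an appearance term and a semantic term (hypothetical weights)."""
    appearance = mse(pred_rgb, target_rgb)
    semantic = sum(cross_entropy(l, y)
                   for l, y in zip(sem_logits, sem_labels)) / len(sem_labels)
    return w_rgb * appearance + w_sem * semantic

# Toy usage: two rendered pixel values and one pixel with two candidate part labels.
loss = multi_task_loss([0.5, 0.5], [0.5, 0.5], [[0.0, 0.0]], [0])
```

In this toy call the appearance term is zero and the semantic term is the cross-entropy of a uniform two-way prediction, so the two terms can be inspected separately before choosing weights.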