Hitem3D 2.0: Multi-View Guided Native 3D Texture Generation
Hitem3D 2.0 addresses texture quality issues in 3D generation for applications such as computer graphics and virtual reality, and represents an incremental improvement over prior methods.
The paper tackles incomplete texture coverage, cross-view inconsistency, and geometry-texture misalignment in 3D texture generation by proposing Hitem3D 2.0, a multi-view guided native 3D texture generation framework, and demonstrates that it outperforms existing methods in texture detail, fidelity, consistency, coherence, and alignment.
Although recent advances have improved the quality of 3D texture generation, existing methods still struggle with incomplete texture coverage, cross-view inconsistency, and misalignment between geometry and texture. To address these limitations, we propose Hitem3D 2.0, a multi-view guided native 3D texture generation framework that enhances texture quality by integrating 2D multi-view generation priors with native 3D texture representations. Hitem3D 2.0 comprises two key components: a multi-view synthesis framework and a native 3D texture generation model. The multi-view synthesis framework is built on a pre-trained image editing backbone and incorporates plug-and-play modules that explicitly promote geometric alignment, cross-view consistency, and illumination uniformity, thereby enabling the synthesis of high-fidelity multi-view images. Conditioned on the generated views and the 3D geometry, the native 3D texture generation model projects multi-view textures onto 3D surfaces while plausibly completing textures in unseen regions. By combining multi-view consistency constraints with native 3D texture modeling, Hitem3D 2.0 significantly improves texture completeness, cross-view coherence, and geometric alignment. Experimental results demonstrate that Hitem3D 2.0 outperforms existing methods in terms of texture detail, fidelity, consistency, coherence, and alignment.
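To make the idea of projecting multi-view textures onto a 3D surface concrete, the following is a minimal sketch of classical back-projection. It is not the Hitem3D 2.0 model, which is learned and whose architecture the abstract does not specify. The sketch assumes orthographic cameras, nearest-pixel sampling, and no occlusion test, and it only flags unseen points rather than completing them. The function name `backproject_views` and all of its parameters are hypothetical.

```python
import numpy as np


def backproject_views(points, normals, rotations, images):
    """Blend colors from several views onto surface points.

    Assumptions (illustrative only, not taken from the paper):
      - points:    (N, 3) surface samples inside [-1, 1]^3
      - normals:   (N, 3) unit normals at those samples
      - rotations: list of (3, 3) world-to-camera rotation matrices
      - images:    list of (H, W, 3) float arrays, one per view
      - each camera is orthographic and sits on its own +z axis,
        looking toward -z; occlusion is ignored.
    Returns per-point colors and a boolean mask of points seen by any view.
    """
    n = points.shape[0]
    color_sum = np.zeros((n, 3))
    weight_sum = np.zeros(n)

    for rot, img in zip(rotations, images):
        h, w = img.shape[:2]
        p_cam = points @ rot.T   # points in camera coordinates
        n_cam = normals @ rot.T  # normals in camera coordinates

        # Weight by how directly the surface faces the camera;
        # back-facing points get zero weight.
        weight = np.clip(n_cam[:, 2], 0.0, None)

        # Orthographic projection: x -> column, y -> row (image y points down).
        col = np.rint((p_cam[:, 0] + 1.0) * 0.5 * (w - 1))
        row = np.rint((1.0 - p_cam[:, 1]) * 0.5 * (h - 1))
        col = np.clip(col, 0, w - 1).astype(int)
        row = np.clip(row, 0, h - 1).astype(int)

        color_sum += weight[:, None] * img[row, col]
        weight_sum += weight

    seen = weight_sum > 1e-6
    colors = np.zeros((n, 3))
    colors[seen] = color_sum[seen] / weight_sum[seen, None]
    return colors, seen


if __name__ == "__main__":
    # Two points on opposite sides of an object, observed from the front only.
    pts = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
    nrm = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
    front = np.eye(3)
    red = np.zeros((4, 4, 3))
    red[..., 0] = 1.0
    colors, seen = backproject_views(pts, nrm, [front], [red])
    print(colors, seen)
```

Points left with `seen == False` are the unseen regions the abstract refers to. In this sketch they simply stay black, whereas the abstract states that the native 3D texture generation model completes them plausibly.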