MVPainter: Accurate and Detailed 3D Texture Generation via Multi-View Diffusion with Geometric Control
This work addresses the underexplored problem of high-quality 3D texture generation, improving the texture stage of existing pipelines that generate multi-view images with diffusion models and bake them into UV textures.
The paper tackles the generation of accurate and detailed 3D textures through multi-view image diffusion, addressing alignment with the reference texture, consistency with the underlying geometry, and local texture quality. It reports state-of-the-art results on all three dimensions, as measured by human-aligned evaluations.
Recently, significant advances have been made in 3D object generation. Building upon the generated geometry, current pipelines typically employ image diffusion models to generate multi-view RGB images, followed by UV texture reconstruction through texture baking. While 3D geometry generation has improved substantially, supported by multiple open-source frameworks, 3D texture generation remains underexplored. In this work, we systematically investigate 3D texture generation through the lens of three core dimensions: reference-texture alignment, geometry-texture consistency, and local texture quality. To address these dimensions, we propose MVPainter, which employs data filtering and augmentation strategies to enhance texture fidelity and detail, and introduces ControlNet-based geometric conditioning to improve texture-geometry alignment. Furthermore, we extract physically based rendering (PBR) attributes from the generated views to produce PBR meshes suitable for real-world rendering applications. MVPainter achieves state-of-the-art results across all three dimensions, as demonstrated by human-aligned evaluations. To facilitate further research and reproducibility, we also release our full pipeline as an open-source system, including data construction, model architecture, and evaluation tools.
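To make the texture baking step mentioned above concrete, the following is a minimal sketch of one common way to merge multi-view colors into a single UV texture: a per-texel weighted average. It is an illustration under stated assumptions, not MVPainter's actual baking procedure, which the abstract does not specify. The function names `bake_texel` and `bake_texture`, the dictionary-based view format, and the choice of weights (for example, the cosine between the surface normal and the view direction) are all hypothetical.

```python
def bake_texel(samples, eps=1e-8):
    """Blend one texel from (rgb, weight) samples, one per view that sees it.

    Returns the weighted-average RGB tuple, or None if no view covers the texel.
    The weighting scheme is an assumption, e.g. cosine of the viewing angle.
    """
    total = sum(weight for _, weight in samples)
    if total <= eps:
        return None  # texel is not visible from any view
    return tuple(
        sum(rgb[i] * weight for rgb, weight in samples) / total
        for i in range(3)
    )


def bake_texture(views, height, width):
    """Bake a height x width UV texture from several views.

    views: list of dicts mapping a texel (row, col) to an (rgb, weight) pair.
           A texel missing from a dict is treated as invisible in that view.
    """
    texture = []
    for row in range(height):
        texels = []
        for col in range(width):
            samples = [view[(row, col)] for view in views if (row, col) in view]
            texels.append(bake_texel(samples))
        texture.append(texels)
    return texture


# Two views observe texel (0, 0); no view observes texel (0, 1).
views = [
    {(0, 0): ((1.0, 0.0, 0.0), 3.0)},  # red, seen nearly head-on
    {(0, 0): ((0.0, 0.0, 1.0), 1.0)},  # blue, seen at a grazing angle
]
texture = bake_texture(views, height=1, width=2)
```

Texels that no view covers come back as `None` here. A practical pipeline would typically fill such holes afterwards, for instance by inpainting in UV space.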