CV | AI | Mar 19

CustomTex: High-fidelity Indoor Scene Texturing via Multi-Reference Customization

arXiv:2603.19121 | 54.0 | h-index: 12
AI Analysis

This paper addresses the problem of precise, high-quality texture generation for 3D indoor scenes and offers a more user-friendly editing path, though it appears incremental because it builds on existing text-driven and distillation methods.

The paper tackles the challenge of creating high-fidelity, customizable 3D indoor scene textures by introducing CustomTex, a framework that uses reference images for instance-level control, resulting in textures with superior sharpness, reduced artifacts, and minimal baked-in shading compared to state-of-the-art methods.

The creation of high-fidelity, customizable 3D indoor scene textures remains a significant challenge. While text-driven methods offer flexibility, they lack the precision for fine-grained, instance-level control, and often produce textures with insufficient quality, artifacts, and baked-in shading. To overcome these limitations, we introduce CustomTex, a novel framework for instance-level, high-fidelity scene texturing driven by reference images. CustomTex takes an untextured 3D scene and a set of reference images specifying the desired appearance for each object instance, and generates a unified, high-resolution texture map. The core of our method is a dual-distillation approach that separates semantic control from pixel-level enhancement. We employ semantic-level distillation, equipped with an instance cross-attention, to ensure semantic plausibility and "reference-instance" alignment, and pixel-level distillation to enforce high visual fidelity. Both are unified within a Variational Score Distillation (VSD) optimization framework. Experiments demonstrate that CustomTex achieves precise instance-level consistency with reference images and produces textures with superior sharpness, reduced artifacts, and minimal baked-in shading compared to state-of-the-art methods. Our work establishes a more direct and user-friendly path to high-quality, customizable 3D scene appearance editing.
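The dual-distillation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the two branches are combined, so the weighted sum below, the names `vsd_residual`, `combined_update`, `lam_sem`, `lam_pix`, and the identity "renderer" are all assumptions made for illustration. The only part taken from the general VSD formulation is that each branch contributes a residual of the form (pretrained noise prediction minus fine-tuned noise prediction).

```python
# Toy sketch only: the "texture" is a short list of floats and the noise
# predictions are hand-written numbers. None of these names come from the paper.

def vsd_residual(eps_pretrained, eps_finetuned, weight=1.0):
    """VSD-style residual per element: weight * (pretrained - fine-tuned prediction)."""
    return [weight * (p - q) for p, q in zip(eps_pretrained, eps_finetuned)]


def combined_update(texture, sem_residual, pix_residual,
                    lam_sem=1.0, lam_pix=0.5, lr=0.1):
    """One gradient-descent step on the texture using both residuals.

    Assumption: the renderer is the identity, so d(render)/d(texture) = 1,
    and the two branches are merged by a simple weighted sum.
    """
    return [
        x - lr * (lam_sem * s + lam_pix * p)
        for x, s, p in zip(texture, sem_residual, pix_residual)
    ]


texture = [0.2, 0.8]
sem = vsd_residual([0.5, -0.1], [0.3, 0.1])  # stand-in for the semantic-level branch
pix = vsd_residual([0.1, 0.0], [0.0, 0.2])   # stand-in for the pixel-level branch
new_texture = combined_update(texture, sem, pix)
```

In a real pipeline the stand-in numbers would come from diffusion models evaluated on rendered views, and the update would be backpropagated through a differentiable renderer into the texture map. The sketch only shows how two distillation signals could share one optimization step.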

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
