CaliTex: Geometry-Calibrated Attention for View-Coherent 3D Texture Generation
CaliTex addresses inconsistent textures across viewpoints in 3D generation, a known bottleneck for applications such as gaming, VR, and design, and proposes a new method for it.
The paper tackles cross-view inconsistency in 3D texture generation by introducing CaliTex, a framework built on geometry-calibrated attention that explicitly aligns attention with 3D structure. The resulting textures are seamless and view-consistent, and the paper reports that they outperform open-source and commercial baselines.
Despite major advances brought by diffusion-based models, current 3D texture generation systems remain hindered by cross-view inconsistency: textures that appear convincing from one viewpoint often fail to align with those seen from others. We find that this issue arises from attention ambiguity, where unstructured full attention is applied indiscriminately across tokens and modalities, causing geometric confusion and unstable appearance-structure coupling. To address this, we introduce CaliTex, a framework of geometry-calibrated attention that explicitly aligns attention with 3D structure. It comprises two modules: Part-Aligned Attention, which enforces spatial alignment across semantically matched parts, and Condition-Routed Attention, which routes appearance information through geometry-conditioned pathways to maintain spatial fidelity. Coupled with a two-stage diffusion transformer, CaliTex makes geometric coherence an inherent behavior of the network rather than a byproduct of optimization. Empirically, CaliTex produces seamless and view-consistent textures and outperforms both open-source and commercial baselines.
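To make the contrast between unstructured full attention and structure-aware attention concrete, the sketch below restricts each token to attend only to tokens that carry the same part label. This is one plausible reading of "alignment across semantically matched parts", not the paper's implementation: the function name `part_masked_attention`, the integer `part_ids` labels, and the hard same-part mask are all assumptions made for illustration.

```python
import math

def part_masked_attention(q, k, v, part_ids):
    """Scaled dot-product attention with a hard same-part mask.

    Hypothetical illustration only: token i may attend to token j
    only when part_ids[i] == part_ids[j]. The abstract does not state
    that CaliTex uses this exact masking rule.
    """
    n, d = len(q), len(q[0])
    out, weights = [], []
    for i in range(n):
        # Keys visible to query i: those sharing its part label
        # (always includes i itself, so the list is never empty).
        allowed = [j for j in range(n) if part_ids[j] == part_ids[i]]
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in allowed
        ]
        # Numerically stable softmax over the visible keys only.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        row = [0.0] * n
        for j, e in zip(allowed, exps):
            row[j] = e / z
        weights.append(row)
        # Weighted sum of value vectors.
        out.append([sum(row[j] * v[j][c] for j in range(n))
                    for c in range(len(v[0]))])
    return out, weights

# Toy example: four tokens from two views, two parts (labels 0 and 1).
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
part_ids = [0, 1, 0, 1]

out, weights = part_masked_attention(q, k, v, part_ids)
```

In this toy setup, tokens 0 and 2 share a part and exchange information, while tokens 1 and 3 form a second group; attention weights between the two groups are exactly zero. A learned or soft bias toward matched parts would be an equally reasonable reading of the abstract.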