CV · GR · IV · Feb 23, 2022

Paying U-Attention to Textures: Multi-Stage Hourglass Vision Transformer for Universal Texture Synthesis

arXiv:2202.11703v3 · 8 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of synthesizing diverse textures while preserving their structures, for applications in computer graphics and vision. It offers a novel method for a known bottleneck.

The paper tackles universal texture synthesis by introducing a U-Attention Vision Transformer with a hierarchical hourglass backbone, achieving stronger 2x synthesis than previous work on both stochastic and structured textures and generalizing to unseen textures without fine-tuning.

We present a novel U-Attention vision Transformer for universal texture synthesis. We exploit the natural long-range dependencies enabled by the attention mechanism to allow our approach to synthesize diverse textures while preserving their structures in a single inference. We propose a hierarchical hourglass backbone that attends to the global structure and performs patch mapping at varying scales in a coarse-to-fine-to-coarse stream. Completed by skip connection and convolution designs that propagate and fuse information at different scales, our hierarchical U-Attention architecture unifies attention to features from macro structures to micro details, and progressively refines synthesis results at successive stages. Our method achieves stronger 2$\times$ synthesis than previous work on both stochastic and structured textures while generalizing to unseen textures without fine-tuning. Ablation studies demonstrate the effectiveness of each component of our architecture.
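The coarse-to-fine-to-coarse patch attention described in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the function names (`to_patches`, `from_patches`, `self_attention`, `hourglass`), the patch-size schedule, the identity query/key/value projections, and the plain residual connection are all assumptions made for illustration, and the 2$\times$ upsampling and convolutional fusion are left out.

```python
import numpy as np

def to_patches(x, p):
    """Split an (H, W, C) map into tokens, one per non-overlapping p x p patch."""
    H, W, C = x.shape
    x = x.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, p * p * C)

def from_patches(tokens, p, shape):
    """Inverse of to_patches: fold tokens back into an (H, W, C) map."""
    H, W, C = shape
    x = tokens.reshape(H // p, W // p, p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Assumption: identity projections stand in for learned Q/K/V weights.
    """
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ tokens

def hourglass(x, patch_sizes=(16, 8, 4, 8, 16)):
    """Attend over patches at coarse, then fine, then coarse scales.

    Assumption: the patch-size schedule is hypothetical, and a plain
    residual add stands in for the paper's skip connections.
    """
    for p in patch_sizes:
        tokens = to_patches(x, p)
        mixed = self_attention(tokens)
        x = x + from_patches(mixed, p, x.shape)
    return x

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((32, 32, 3))
refined = hourglass(feature_map)
```

Large patches give few tokens, so attention at those stages relates macro structures; small patches give many tokens and relate micro details. The output keeps the input's spatial size because this sketch only mixes information and does not synthesize a larger texture.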

