CV
Jun 24, 2024

ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance

arXiv:2406.16476v1
25 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of scaling diffusion models to higher resolutions for image synthesis and editing. It is a strong incremental improvement over existing methods.

The paper tackles high-resolution image generation (e.g., 4K) with diffusion models, which at such resolutions often produce over-smoothed content and structural distortions. It introduces ResMaster, a training-free method that uses structural and fine-grained guidance to generate high-quality images beyond the base model's resolution limit. The authors report that it sets a new benchmark for high-resolution image generation.

Diffusion models excel at producing high-quality images; however, scaling to higher resolutions, such as 4K, often results in over-smoothed content, structural distortions, and repetitive patterns. To this end, we introduce ResMaster, a novel, training-free method that empowers resolution-limited diffusion models to generate high-quality images beyond resolution restrictions. Specifically, ResMaster leverages a low-resolution reference image created by a pre-trained diffusion model to provide structural and fine-grained guidance for crafting high-resolution images on a patch-by-patch basis. To ensure a coherent global structure, ResMaster meticulously aligns the low-frequency components of high-resolution patches with the low-resolution reference at each denoising step. For fine-grained guidance, tailored image prompts based on the low-resolution reference and enriched textual prompts produced by a vision-language model are incorporated. This approach could significantly mitigate local pattern distortions and improve detail refinement. Extensive experiments validate that ResMaster sets a new benchmark for high-resolution image generation and demonstrates promising efficiency. The project page is https://shuweis.github.io/ResMaster .
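The structural guidance described in the abstract, aligning the low-frequency components of a high-resolution patch with the low-resolution reference, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "low frequency" means the mean of each `scale` x `scale` block, that the patch is a single-channel 2D list, and that each reference pixel corresponds to exactly one block. The names `block_means`, `align_low_frequencies`, and `scale` are hypothetical.

```python
def block_means(patch, scale):
    """Average-pool a 2D patch (list of rows) over scale x scale blocks."""
    h, w = len(patch), len(patch[0])
    if h % scale or w % scale:
        raise ValueError("patch height and width must be multiples of scale")
    area = scale * scale
    return [
        [
            sum(
                patch[by * scale + dy][bx * scale + dx]
                for dy in range(scale)
                for dx in range(scale)
            ) / area
            for bx in range(w // scale)
        ]
        for by in range(h // scale)
    ]


def align_low_frequencies(patch, reference, scale):
    """Shift each block of `patch` so its mean equals the matching reference pixel.

    Assumption: the block mean stands in for the low-frequency component.
    The deviation of every pixel from its block mean (the fine detail)
    is left unchanged.
    """
    means = block_means(patch, scale)
    if len(reference) != len(means) or len(reference[0]) != len(means[0]):
        raise ValueError("reference must have one pixel per block of the patch")
    return [
        [
            value + reference[y // scale][x // scale] - means[y // scale][x // scale]
            for x, value in enumerate(row)
        ]
        for y, row in enumerate(patch)
    ]


# Hypothetical 2x4 patch and 1x2 low-resolution reference, scale factor 2.
patch = [[1.0, 3.0, 10.0, 10.0],
         [5.0, 7.0, 10.0, 14.0]]
reference = [[0.0, 20.0]]
aligned = align_low_frequencies(patch, reference, scale=2)
```

In the setting the abstract describes, a step like this would presumably be applied to each patch at every denoising step, so that the global layout follows the reference while the model remains free to synthesize detail within each block.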

