CV · Mar 19, 2024

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

arXiv:2403.12963v1 · 61 citations · Has Code · ECCV
Originality: Incremental advance
AI Analysis

This work addresses the challenge of scaling pre-trained diffusion models to high resolutions for image synthesis. It is an incremental advance: it builds on existing models and adds a novel frequency-based adjustment.

The paper tackles the problem of generating high-resolution images from pre-trained diffusion models without training, addressing issues like repetitive patterns and distortions, and achieves arbitrary-size, high-quality synthesis with structural consistency.

In this study, we delve into the generation of high-resolution images from pre-trained diffusion models, addressing persistent challenges, such as repetitive patterns and structural distortions, that emerge when models are applied beyond their trained resolutions. To address this issue, we introduce an innovative, training-free approach FouriScale from the perspective of frequency domain analysis. We replace the original convolutional layers in pre-trained diffusion models by incorporating a dilation technique along with a low-pass operation, intending to achieve structural consistency and scale consistency across resolutions, respectively. Further enhanced by a padding-then-crop strategy, our method can flexibly handle text-to-image generation of various aspect ratios. By using the FouriScale as guidance, our method successfully balances the structural integrity and fidelity of generated images, achieving an astonishing capacity of arbitrary-size, high-resolution, and high-quality generation. With its simplicity and compatibility, our method can provide valuable insights for future explorations into the synthesis of ultra-high-resolution images. The code will be released at https://github.com/LeonHLJ/FouriScale.
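The two operations named in the abstract, kernel dilation and a low-pass operation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `dilate_kernel` and `ideal_low_pass`, the single-channel 2D inputs, and the choice of an ideal (box) low-pass mask that keeps the lowest 1/scale of frequencies along each axis are all assumptions made for illustration.

```python
import numpy as np


def dilate_kernel(kernel, rate):
    """Insert (rate - 1) zeros between kernel taps (hypothetical helper)."""
    k_h, k_w = kernel.shape
    out = np.zeros(((k_h - 1) * rate + 1, (k_w - 1) * rate + 1), dtype=kernel.dtype)
    out[::rate, ::rate] = kernel
    return out


def ideal_low_pass(feature, scale):
    """Keep only the lowest 1/scale of frequencies along each axis.

    Assumes a single-channel 2D feature map and an ideal box mask;
    the mask shape is an illustrative assumption, not the paper's filter.
    """
    h, w = feature.shape
    # Move the zero-frequency (DC) component to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(feature))
    mask = np.zeros((h, w), dtype=bool)
    half_h, half_w = h // (2 * scale), w // (2 * scale)
    c_h, c_w = h // 2, w // 2
    mask[c_h - half_h:c_h + half_h, c_w - half_w:c_w + half_w] = True
    # Zero out everything outside the box, then return to the spatial domain.
    return np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real


# Example: a 3x3 kernel dilated for 2x resolution, and a checkerboard
# (highest-frequency) pattern that the low-pass step suppresses.
kernel = np.ones((3, 3))
dilated = dilate_kernel(kernel, rate=2)
rows, cols = np.indices((8, 8))
checkerboard = np.where((rows + cols) % 2 == 0, 1.0, -1.0)
filtered = ideal_low_pass(checkerboard, scale=2)
```

In this sketch the dilation rate and the low-pass scale are both set to the resolution ratio (2 here), so the dilated kernel covers the same relative field of view at the larger size while the low-pass step removes the frequencies that dilation would otherwise alias.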

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
