CV LG | Nov 20, 2024

MGHF: Multi-Granular High-Frequency Perceptual Loss for Image Super-Resolution

Shoaib Meraj Sami, Md Mahedi Hasan, Mohammad Saeed Ebrahimi Saadabadi, Jeremy Dawson, Nasser Nasrabadi, Raghuveer Rao

arXiv:2411.13548v25.22 citations | h-index: 5 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of synthesizing realistic and detailed high-resolution images for computer vision applications, representing an incremental improvement over existing perceptual loss methods.

The paper tackles the problem of information loss and complexity in perceptual losses for image super-resolution by proposing an invertible neural network-based multi-granular high-frequency perceptual loss, which significantly improves performance across various super-resolution algorithms, including GAN- and diffusion-based methods.

While different variants of perceptual losses have been employed in the super-resolution literature to synthesize more realistic, appealing, and detailed high-resolution images, most are convolutional neural network (CNN)-based, which causes information loss during guidance, and they often rely on complicated architectures and training procedures. We propose an invertible neural network (INN)-based naive Multi-Granular High-Frequency (MGHF-n) perceptual loss trained on ImageNet to overcome these issues. Furthermore, we develop a comprehensive framework (MGHF-c) with several constraints to preserve, prioritize, and regularize information across multiple perspectives: texture and style preservation, content preservation, regional detail preservation, and joint content-style regularization. Information is prioritized through adaptive entropy-based pruning and reweighting of INN features. We use a Gram matrix loss for style preservation and a mean-squared error loss for content preservation. Additionally, we propose content-style consistency through a correlation loss to regulate unnecessary texture generation while preserving content information. Since small image regions may contain intricate details, we employ modulated PatchNCE on the INN features as a local information preservation objective. Extensive experiments on various super-resolution algorithms, including GAN- and diffusion-based methods, demonstrate that our MGHF framework significantly improves performance. Our code will be released in a public repository after the review process.
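To make the style and content terms mentioned in the abstract concrete, here is a minimal NumPy sketch of a Gram matrix loss and a mean-squared error loss computed on a pair of feature maps. It is an illustration only, not the authors' implementation: the function names, the (C, H, W) feature layout, the Gram normalization, and the loss weights are all assumptions, and random arrays stand in for the INN features of the super-resolved and ground-truth images.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel Gram matrix of a (C, H, W) feature map.

    The division by C*H*W is one common normalization; it is an
    assumption here, not something stated in the abstract.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def style_loss(feat_sr, feat_hr):
    """Mean squared difference between the two Gram matrices."""
    diff = gram_matrix(feat_sr) - gram_matrix(feat_hr)
    return float(np.mean(diff ** 2))


def content_loss(feat_sr, feat_hr):
    """Plain mean-squared error between the two feature maps."""
    return float(np.mean((feat_sr - feat_hr) ** 2))


def combined_loss(feat_sr, feat_hr, style_weight=1.0, content_weight=1.0):
    """Weighted sum of both terms; the weights are hypothetical."""
    return (style_weight * style_loss(feat_sr, feat_hr)
            + content_weight * content_loss(feat_sr, feat_hr))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for features of the ground-truth and super-resolved images.
    feat_hr = rng.standard_normal((4, 8, 8))
    feat_sr = feat_hr + 0.1 * rng.standard_normal((4, 8, 8))
    print("style:", style_loss(feat_sr, feat_hr))
    print("content:", content_loss(feat_sr, feat_hr))
    print("combined:", combined_loss(feat_sr, feat_hr))
```

The two terms are complementary: the Gram matrix discards spatial positions and keeps only channel co-activation statistics, so it captures texture, while the MSE term compares features position by position and therefore constrains content.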
