CATformer: Contrastive Adversarial Transformer for Image Super-Resolution
This work addresses image super-resolution for practical applications, bridging the performance gap among transformer-, diffusion-, and GAN-based methods. It appears incremental, however, because it combines existing techniques.
The paper tackles the enhancement of low-resolution images by introducing CATformer, a neural network that integrates diffusion-inspired feature refinement with adversarial and contrastive learning. On benchmark datasets, it outperforms recent transformer-based and diffusion-inspired methods in both efficiency and visual quality.
Super-resolution remains a promising technique for enhancing the quality of low-resolution images. This study introduces CATformer (Contrastive Adversarial Transformer), a novel neural network that integrates diffusion-inspired feature refinement with adversarial and contrastive learning. CATformer employs a dual-branch architecture: a primary diffusion-inspired transformer progressively refines latent representations, while an auxiliary transformer branch enhances robustness to noise through learned latent contrasts. These complementary representations are fused and decoded using deep Residual-in-Residual Dense Blocks for enhanced reconstruction quality. Extensive experiments on benchmark datasets demonstrate that CATformer outperforms recent transformer-based and diffusion-inspired methods in both efficiency and visual quality. This work bridges the performance gap among transformer-, diffusion-, and GAN-based methods, laying a foundation for practical applications of diffusion-inspired transformers in super-resolution.
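
The data flow described in the abstract (two branches, fusion, residual-in-residual dense decoding) can be illustrated with a minimal sketch. This is not the authors' implementation. Plain bias-free linear maps stand in for the transformer layers, the adversarial and contrastive training objectives are not modeled, and every function name, layer size, and the residual scale of 0.2 is an assumption chosen for illustration.

```python
import random


def make_linear(in_dim, out_dim, rng, scale=0.1):
    """Weight matrix with out_dim rows and in_dim columns of small random values."""
    return [[rng.uniform(-scale, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def linear(w, x):
    """Bias-free matrix-vector product: one output per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def leaky_relu(x, slope=0.2):
    return [v if v > 0 else slope * v for v in x]


def refine_branch(z, weights):
    """Primary-branch stand-in: one residual update per weight matrix,
    so the latent is refined progressively rather than in a single pass."""
    for w in weights:
        z = [zi + ui for zi, ui in zip(z, leaky_relu(linear(w, z)))]
    return z


def auxiliary_branch(z, w):
    """Auxiliary-branch stand-in: a single nonlinear projection of the latent."""
    return leaky_relu(linear(w, z))


def fuse(a, b, w):
    """Concatenate both branch outputs, then project back to the latent size."""
    return linear(w, a + b)


def make_dense_block(dim, growth, layers, rng):
    """Layer k reads the input plus k earlier outputs; the last layer maps back to dim."""
    ws = [make_linear(dim + k * growth, growth, rng) for k in range(layers)]
    ws.append(make_linear(dim + layers * growth, dim, rng))
    return ws


def dense_block(x, weights, beta=0.2):
    """Dense connectivity: each layer sees the input and all earlier layer outputs."""
    feats = list(x)
    for w in weights[:-1]:
        feats = feats + leaky_relu(linear(w, feats))
    out = linear(weights[-1], feats)
    return [xi + beta * oi for xi, oi in zip(x, out)]


def rrdb(x, blocks, beta=0.2):
    """Residual-in-residual: chain dense blocks, then add a scaled outer skip."""
    y = x
    for weights in blocks:
        y = dense_block(y, weights, beta)
    return [xi + beta * yi for xi, yi in zip(x, y)]


def init_params(dim=8, steps=4, growth=4, layers=2, n_blocks=3, seed=0):
    """Hypothetical parameter set; all sizes are illustrative."""
    rng = random.Random(seed)
    return {
        "refine": [make_linear(dim, dim, rng) for _ in range(steps)],
        "aux": make_linear(dim, dim, rng),
        "fuse": make_linear(2 * dim, dim, rng),
        "rrdb": [make_dense_block(dim, growth, layers, rng) for _ in range(n_blocks)],
    }


def forward(z, p):
    """Dual-branch encoding, fusion, then dense residual decoding of one latent vector."""
    primary = refine_branch(z, p["refine"])
    aux = auxiliary_branch(z, p["aux"])
    fused = fuse(primary, aux, p["fuse"])
    return rrdb(fused, p["rrdb"])
```

Calling `forward([0.1 * i for i in range(8)], init_params(dim=8))` returns a vector of the same length as the input latent. Upsampling to the high-resolution image and the discriminator used for adversarial training are deliberately left out, since the abstract does not specify them.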