InfScene-SR: Spatially Continuous Inference for Arbitrary-Size Image Super-Resolution
This work addresses a practical limitation of image super-resolution in applications such as remote sensing, although the contribution is incremental because it builds on existing diffusion models.
The paper tackles the problem of applying diffusion-based super-resolution to arbitrary-sized images without visible seams. It proposes InfScene-SR, which enables spatially continuous super-resolution for large scenes, and demonstrates on remote sensing datasets that the method eliminates boundary artifacts and improves perceptual quality.
Image Super-Resolution (SR) aims to recover high-resolution (HR) details from low-resolution (LR) inputs, a task where Denoising Diffusion Probabilistic Models (DDPMs) have recently shown superior performance compared to Generative Adversarial Network (GAN)-based approaches. However, standard diffusion-based SR models, such as SR3, are typically trained on fixed-size patches and struggle to scale to arbitrary-sized images due to memory constraints. Applying these models via independent patch processing leads to visible seams and inconsistent textures across patch boundaries. In this paper, we propose InfScene-SR, a framework enabling spatially continuous super-resolution for large, arbitrary scenes. We adapt the iterative refinement process of diffusion models with a novel guided and variance-corrected fusion mechanism, allowing for the seamless generation of large-scale high-resolution imagery without retraining. We validate our approach on remote sensing datasets, demonstrating that InfScene-SR not only reconstructs fine details with high perceptual quality but also eliminates boundary artifacts, benefiting downstream tasks such as semantic segmentation.
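The abstract does not spell out the fusion rule, so the following is only a generic illustration of why fusing overlapping patches calls for some form of variance correction. It is not the authors' implementation. It is a one-dimensional sketch under stated assumptions: windows overlap with a fixed stride, each window carries independent unit-variance Gaussian noise, and all names (`tile_starts`, `fuse_noise`, `empirical_variance`) are hypothetical. Plain averaging of k independent samples shrinks the variance to 1/k, whereas dividing the sum by sqrt(k) keeps it at 1, which is the level a diffusion sampler typically expects at each step.

```python
import math
import random


def tile_starts(length, patch, stride):
    """Start offsets of overlapping windows that cover [0, length)."""
    if length < patch:
        raise ValueError("signal must be at least one patch long")
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] + patch < length:
        starts.append(length - patch)  # last window sits flush with the edge
    return starts


def fuse_noise(patches, starts, length, variance_corrected=True):
    """Fuse per-window noise samples into one signal of size `length`.

    Assumption for illustration: windows are weighted equally.
    With k overlapping independent N(0, 1) samples at a position,
    sum / k has variance 1/k, while sum / sqrt(k) has variance 1.
    """
    total = [0.0] * length
    count = [0] * length
    for start, patch in zip(starts, patches):
        for offset, value in enumerate(patch):
            total[start + offset] += value
            count[start + offset] += 1
    if variance_corrected:
        return [t / math.sqrt(c) for t, c in zip(total, count)]
    return [t / c for t, c in zip(total, count)]


def empirical_variance(variance_corrected, trials=20000, seed=0):
    """Estimate the variance of the fused noise at a doubly covered position."""
    rng = random.Random(seed)
    length, patch, stride = 10, 4, 2
    starts = tile_starts(length, patch, stride)
    samples = []
    for _ in range(trials):
        patches = [[rng.gauss(0.0, 1.0) for _ in range(patch)] for _ in starts]
        fused = fuse_noise(patches, starts, length, variance_corrected)
        samples.append(fused[3])  # index 3 is covered by two windows
    mean = sum(samples) / trials
    return sum((s - mean) ** 2 for s in samples) / trials


if __name__ == "__main__":
    print("plain average    :", round(empirical_variance(False), 2))
    print("variance-corrected:", round(empirical_variance(True), 2))
```

In this toy setting the plain average yields a variance near 0.5 in the overlap while the corrected version stays near 1. Smooth blending weights or conditioning on the low-resolution input would need a different normalisation and are outside this sketch.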