How far have we gone in Generative Image Restoration? A study on its capability, limitations and evaluation practices
This work provides a systematic study for researchers and developers in low-level vision, redefining the understanding of modern generative image restoration models and guiding future development.
This study evaluates Generative Image Restoration (GIR) models using a new multi-dimensional evaluation pipeline, revealing critical performance disparities across diverse architectures. It identifies a paradigm shift in failure modes, moving from detail scarcity to detail quality and semantic control.
Generative Image Restoration (GIR) has achieved impressive perceptual realism, but how far have its practical capabilities truly advanced compared with previous methods? To answer this, we present a large-scale study grounded in a new multi-dimensional evaluation pipeline that assesses models on detail, sharpness, semantic correctness, and overall quality. Our analysis covers diverse architectures, including diffusion-based, GAN-based, PSNR-oriented, and general-purpose generation models, and reveals critical performance disparities among them. It also uncovers a key evolution in failure modes that signifies a paradigm shift for perception-oriented low-level vision: the central challenge is moving from the earlier problem of detail scarcity (under-generation) to the new frontier of detail quality and semantic control (preventing over-generation). We also leverage our benchmark to train a new image quality assessment (IQA) model that better aligns with human perceptual judgments. Ultimately, this work provides a systematic study of modern generative image restoration models, offering insights that redefine our understanding of their true state and chart a course for future development.