Conformer and Blind Noisy Students for Improved Image Quality Assessment
This work addresses the need for better perceptual quality metrics in image generation and restoration, which is crucial for developers and researchers in computer vision, though it is incremental as it builds on existing IQA methods.
The paper tackled the problem of perceptual image quality assessment (IQA) for generated images, where traditional metrics like PSNR or SSIM fail to align with human opinion, by proposing transformer-based full-reference models and a semi-supervised knowledge distillation method for blind IQA, achieving competitive results with rankings of 4th and 3rd in the NTIRE 2022 challenge.
Generative models for image restoration, enhancement, and generation have significantly improved the quality of the generated images. Surprisingly, these models produce more pleasant images to the human eye than other methods, yet, they may get a lower perceptual quality score using traditional perceptual quality metrics such as PSNR or SSIM. Therefore, it is necessary to develop a quantitative metric to reflect the performance of new algorithms, which should be well-aligned with the person's mean opinion score (MOS). Learning-based approaches for perceptual image quality assessment (IQA) usually require both the distorted and reference image for measuring the perceptual quality accurately. However, commonly only the distorted or generated image is available. In this work, we explore the performance of transformer-based full-reference IQA models. We also propose a method for IQA based on semi-supervised knowledge distillation from full-reference teacher models into blind student models using noisy pseudo-labeled data. Our approaches achieved competitive results on the NTIRE 2022 Perceptual Image Quality Assessment Challenge: our full-reference model was ranked 4th, and our blind noisy student was ranked 3rd among 70 participants, each in their respective track.