Black-box Adversarial Attacks Against Image Quality Assessment Models
This work probes the security loopholes of NR-IQA models that must be understood before practical deployment, and it represents an incremental advance in adversarial attack research.
The paper tackles the vulnerability of No-Reference Image Quality Assessment (NR-IQA) models by developing the first black-box adversarial attack method against them. The attack successfully misleads all evaluated models, and the generated perturbations do not transfer between models, which makes them useful for investigating the specialities of individual models.
The goal of No-Reference Image Quality Assessment (NR-IQA) is to predict the perceptual quality of an image in line with its subjective evaluation. To put NR-IQA models into practice, it is essential to study their potential loopholes for model refinement. This paper makes the first attempt to explore black-box adversarial attacks on NR-IQA models. Specifically, we first formulate the attack problem as maximizing the deviation between the estimated quality scores of the original and perturbed images, while restricting the distortion of the perturbed image to preserve its visual quality. Under this formulation, we then design a bi-directional loss function that pushes the estimated quality scores of adversarial examples in the opposite direction with maximum deviation. On this basis, we finally develop an efficient and effective black-box attack method against NR-IQA models. Extensive experiments reveal that all the evaluated NR-IQA models are vulnerable to the proposed attack method. Moreover, the generated perturbations are not transferable, so they can be used to investigate the specialities of disparate IQA models.
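To make the formulation concrete, the following is a minimal illustrative sketch and not the attack method developed in the paper. It assumes quality scores in [0, 1], uses an L-infinity bound `eps` as a stand-in for the distortion constraint, and swaps in a plain random search as the query-only optimizer. The names `toy_quality_model`, `bidirectional_loss`, and `random_search_attack` are hypothetical, and the toy scorer (mean intensity) merely stands in for a real NR-IQA model.

```python
import numpy as np


def toy_quality_model(image):
    """Hypothetical stand-in scorer: mean intensity as a 'quality' in [0, 1]."""
    return float(np.mean(image))


def bidirectional_loss(score_adv, score_orig, lo=0.0, hi=1.0):
    """Loss to minimize; it rewards moving the score to the opposite side.

    Assumption: scores live in [lo, hi]. Images that score high are pushed
    toward lo, and images that score low are pushed toward hi.
    """
    mid = 0.5 * (lo + hi)
    if score_orig >= mid:
        return score_adv - lo  # distance still left to the low end
    return hi - score_adv      # distance still left to the high end


def random_search_attack(model, image, eps=8 / 255, step=2 / 255,
                         n_steps=200, seed=0):
    """Query-only attack: propose random sign steps, keep the improving ones.

    The perturbation stays inside an L-infinity ball of radius eps (an assumed
    distortion constraint) and inside the valid pixel range [0, 1].
    """
    rng = np.random.default_rng(seed)
    score_orig = model(image)
    best = image.copy()
    best_loss = bidirectional_loss(score_orig, score_orig)
    for _ in range(n_steps):
        signs = rng.choice([-1.0, 1.0], size=image.shape)
        # Project the proposal back onto the eps-ball around the original.
        delta = np.clip(best + step * signs - image, -eps, eps)
        candidate = np.clip(image + delta, 0.0, 1.0)
        loss = bidirectional_loss(model(candidate), score_orig)
        if loss < best_loss:  # accept only proposals that lower the loss
            best, best_loss = candidate, loss
    return best, score_orig, model(best)
```

Calling `random_search_attack(toy_quality_model, img)` on an image `img` with values in [0, 1] returns the perturbed image together with the original and attacked scores. Only the model's output scores are used, never its gradients, which is what makes the setting black-box.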