CV CRDec 10, 2024

Backdoor Attacks against No-Reference Image Quality Assessment Models via a Scalable Trigger

Yi Yu, Song Xia, Xun Lin, Wenhan Yang, Shijian Lu, Yap-peng Tan, Alex Kot

arXiv:2412.07277v314.115 citationsh-index: 11Has CodeAAAI

Originality Highly original

AI Analysis

This work addresses security risks in computer vision systems that rely on NR-IQA, such as low-light enhancement, by exposing a scalable and effective attack method, though it is incremental in the context of adversarial machine learning.

The paper tackles the vulnerability of No-Reference Image Quality Assessment (NR-IQA) models to backdoor attacks by proposing BAIQA, a scalable poisoning-based method that manipulates model outputs to any target value using a trigger in the DCT domain, achieving high attack success rates (e.g., up to 99.9% on some models).

No-Reference Image Quality Assessment (NR-IQA), responsible for assessing the quality of a single input image without using any reference, plays a critical role in evaluating and optimizing computer vision systems, e.g., low-light enhancement. Recent research indicates that NR-IQA models are susceptible to adversarial attacks, which can significantly alter predicted scores with visually imperceptible perturbations. Despite revealing vulnerabilities, these attack methods have limitations, including high computational demands, untargeted manipulation, limited practical utility in white-box scenarios, and reduced effectiveness in black-box scenarios. To address these challenges, we shift our focus to another significant threat and present a novel poisoning-based backdoor attack against NR-IQA (BAIQA), allowing the attacker to manipulate the IQA model's output to any desired target value by simply adjusting a scaling coefficient $α$ for the trigger. We propose to inject the trigger in the discrete cosine transform (DCT) domain to improve the local invariance of the trigger for countering trigger diminishment in NR-IQA models due to widely adopted data augmentations. Furthermore, the universal adversarial perturbations (UAP) in the DCT space are designed as the trigger, to increase IQA model susceptibility to manipulation and improve attack effectiveness. In addition to the heuristic method for poison-label BAIQA (P-BAIQA), we explore the design of clean-label BAIQA (C-BAIQA), focusing on $α$ sampling and image data refinement, driven by theoretical insights we reveal. Extensive experiments on diverse datasets and various NR-IQA models demonstrate the effectiveness of our attacks. Code can be found at https://github.com/yuyi-sd/BAIQA.

View on arXiv PDF Code

Similar