Adversarial Image Color Transformations in Explicit Color Filter Space
This work studies the vulnerability of image classifiers to adversarial attacks by introducing a human-interpretable color transformation attack, though the contribution is incremental because it builds on existing color transformation approaches.
The authors propose Adversarial Color Filter (AdvCF), a color transformation attack on deep neural networks that is optimized in an explicit color filter space. In a user study and direct comparisons, AdvCF achieves better image acceptability and efficiency than the state-of-the-art human-interpretable color transformation attack.
Deep neural networks have been shown to be vulnerable to adversarial images. Conventional attacks strive for indistinguishable adversarial images with strictly restricted perturbations. Recently, researchers have moved to explore distinguishable yet non-suspicious adversarial images and demonstrated that color transformation attacks are effective. In this work, we propose Adversarial Color Filter (AdvCF), a novel color transformation attack that is optimized with gradient information in the parameter space of a simple color filter. In particular, our color filter space is explicitly specified, so we are able to provide a systematic analysis of model robustness against adversarial color transformations from both the attack and defense perspectives. In contrast, existing color transformation attacks do not allow such a systematic analysis because they lack an explicit space. We further demonstrate the effectiveness of AdvCF in fooling image classifiers and compare it with other color transformation attacks in terms of robustness to defenses and, through an extensive user study, image acceptability. We also highlight the human-interpretability of AdvCF and show its superiority over the state-of-the-art human-interpretable color transformation attack in both image acceptability and efficiency. Additional results provide new insights into model robustness against AdvCF on three other visual tasks.
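To make the idea of optimizing in an explicit filter parameter space concrete, the following is a minimal, dependency-free sketch and not the authors' implementation. It assumes the filter is a per-channel monotonic piecewise-linear curve with K slope parameters, which is one common way to parameterize a simple color filter; the abstract does not specify the form. The classifier `toy_logits`, the slope bounds, and the step size are hypothetical stand-ins, and the gradient is estimated by finite differences where a real attack would backpropagate through the model.

```python
import math

K = 4  # number of linear pieces per channel (assumed, not from the paper)


def curve(x, slopes):
    """Monotonic piecewise-linear curve on [0, 1]; equal slopes give identity."""
    k = len(slopes)
    total = sum(slopes)
    acc = sum(min(max(k * x - i, 0.0), 1.0) * s for i, s in enumerate(slopes))
    return acc / total


def apply_filter(pixels, theta):
    """pixels: list of (r, g, b) in [0, 1]; theta: three lists of positive slopes."""
    return [tuple(curve(p[c], theta[c]) for c in range(3)) for p in pixels]


def toy_logits(pixels):
    """Hypothetical two-class model: class 0 favours red, class 1 favours blue."""
    n = len(pixels)
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    return [4.0 * (mean[0] - mean[2]), 4.0 * (mean[2] - mean[0])]


def loss(pixels, theta, label):
    """Cross-entropy of the true label, evaluated on the filtered image."""
    logits = toy_logits(apply_filter(pixels, theta))
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def attack(pixels, label, steps=100, lr=0.5, eps=1e-4,
           min_slope=0.2, max_slope=5.0):
    """Gradient ascent on the loss over the 3 * K filter parameters only."""
    theta = [[1.0] * K for _ in range(3)]  # start from the identity filter
    for _ in range(steps):
        base = loss(pixels, theta, label)
        grad = [[0.0] * K for _ in range(3)]
        for c in range(3):
            for i in range(K):
                # Forward finite difference as a stand-in for backpropagation.
                theta[c][i] += eps
                grad[c][i] = (loss(pixels, theta, label) - base) / eps
                theta[c][i] -= eps
        for c in range(3):
            for i in range(K):
                # Keep slopes in a bounded positive range so the curve stays
                # monotonic and the filtered image remains plausible.
                stepped = theta[c][i] + lr * grad[c][i]
                theta[c][i] = min(max(stepped, min_slope), max_slope)
    return theta


image = [(0.8, 0.3, 0.2), (0.7, 0.4, 0.3), (0.9, 0.2, 0.1), (0.6, 0.5, 0.4)]
adv_theta = attack(image, label=0)
adv_image = apply_filter(image, adv_theta)
print("learned slopes per channel:", adv_theta)
```

Because the search space is only the 3 * K slopes, the learned parameters can be read directly as a color curve per channel, which illustrates the kind of interpretability and systematic analysis that an explicit filter space permits.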