Natural Color Fool: Towards Boosting Black-box Unrestricted Attacks
This work addresses the challenge of creating stealthy adversarial examples for machine learning security, representing an incremental advance in black-box attack methods.
The paper tackles the problem of limited black-box attack performance in unrestricted color attacks by proposing Natural Color Fool (NCF), which boosts transferability without damaging image quality, achieving average improvements of 15.0% to 32.9% against normally trained models and 10.0% to 25.3% against defense methods.
Unrestricted color attacks, which manipulate semantically meaningful color of an image, have shown their stealthiness and success in fooling both human eyes and deep neural networks. However, current works usually sacrifice the flexibility of the uncontrolled setting to ensure the naturalness of adversarial examples. As a result, the black-box attack performance of these methods is limited. To boost transferability of adversarial examples without damaging image quality, we propose a novel Natural Color Fool (NCF) which is guided by realistic color distributions sampled from a publicly available dataset and optimized by our neighborhood search and initialization reset. By conducting extensive experiments and visualizations, we convincingly demonstrate the effectiveness of our proposed method. Notably, on average, results show that our NCF can outperform state-of-the-art approaches by 15.0%$\sim$32.9% for fooling normally trained models and 10.0%$\sim$25.3% for evading defense methods. Our code is available at https://github.com/ylhz/Natural-Color-Fool.