CV, Nov 30, 2021

Human Imperceptible Attacks and Applications to Improve Fairness

Xinru Hua, Huanzhong Xu, Jose Blanchet, Viet Nguyen

arXiv:2111.15603v1, 4.74 citations

Originality: Incremental advance

AI Analysis

This work addresses security and fairness issues in AI systems for applications such as image classification. The advance is incremental because it builds on existing DRO and attack methods.

The paper tackles the problem of designing human-imperceptible adversarial attacks that degrade neural network performance. It introduces a Distributionally Robust Optimization (DRO) framework that integrates human-based image quality assessment, and reports that the resulting attacks are less perceptible to humans than those of state-of-the-art methods. It also demonstrates that DRO training with these attacks can improve group fairness in image classification.

Modern neural networks are able to perform at least as well as humans in numerous tasks involving object classification and image generation. However, small perturbations which are imperceptible to humans may significantly degrade the performance of well-trained deep neural networks. We provide a Distributionally Robust Optimization (DRO) framework which integrates human-based image quality assessment methods to design optimal attacks that are imperceptible to humans but significantly damaging to deep neural networks. Through extensive experiments, we show that our attack algorithm generates better-quality (less perceptible to humans) attacks than other state-of-the-art human imperceptible attack methods. Moreover, we demonstrate that DRO training using our optimally designed human imperceptible attacks can improve group fairness in image classification. Towards the end, we provide an algorithmic implementation to speed up DRO training significantly, which could be of independent interest.
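To make the idea of a cost-penalized attack concrete, here is a minimal sketch and not the paper's algorithm. It assumes the common Lagrangian relaxation of a DRO inner problem, in which the attacker maximizes the loss minus a penalty on the distance from the clean input. The linear classifier, the squared Euclidean cost (a stand-in for a human-based image quality metric), and the names `penalized_attack`, `lam`, `step`, and `iters` are all illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_loss(w, b, x, y):
    """Cross-entropy loss of a linear classifier on one example, y in {0, 1}."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    eps = 1e-12  # avoids log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def penalized_attack(w, b, x, y, lam=5.0, step=0.05, iters=200):
    """Gradient ascent on loss(x') - lam * ||x' - x||^2.

    Assumption: squared Euclidean distance stands in for a perceptual
    cost; a larger lam keeps the perturbed input closer to the original.
    """
    x_adv = list(x)
    for _ in range(iters):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
        # The gradient of the logistic loss w.r.t. the input is (p - y) * w;
        # the penalty contributes -2 * lam * (x' - x).
        grad = [
            (p - y) * wi - 2.0 * lam * (xa - xo)
            for wi, xa, xo in zip(w, x_adv, x)
        ]
        x_adv = [xa + step * g for xa, g in zip(x_adv, grad)]
    return x_adv


if __name__ == "__main__":
    w, b = [2.0, -1.0], 0.0   # toy classifier weights (hypothetical)
    x, y = [1.0, 0.5], 1      # one clean example with its label
    x_adv = penalized_attack(w, b, x, y)
    print("clean loss:    ", logistic_loss(w, b, x, y))
    print("perturbed loss:", logistic_loss(w, b, x_adv, y))
```

In this sketch, `lam` sets the trade-off between damage and perceptibility. A DRO training loop would then minimize the model's loss on such perturbed inputs rather than on the clean ones.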
