Constructing Semantics-Aware Adversarial Examples with a Probabilistic Perspective
This work addresses the challenge of creating more effective adversarial attacks for machine learning security, offering incremental improvements in both semantic preservation and defense evasion.
The paper tackles the problem of generating adversarial examples that preserve image semantics while deceiving classifiers, and it reports higher success rates than traditional methods in circumventing adversarial defenses.
We propose a probabilistic perspective on adversarial examples that allows us to embed a subjective understanding of semantics, expressed as a distribution, into the process of generating adversarial examples in a principled manner. Although our method modifies pixels far more extensively than traditional adversarial attacks, it preserves the overall semantics of the image, so the changes remain difficult for humans to detect. This extensive pixel-level modification also enhances our method's ability to deceive classifiers designed to defend against adversarial attacks. Our empirical findings indicate that the proposed methods achieve higher success rates in circumventing adversarial defense mechanisms, while remaining difficult for human observers to detect.
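
To make the idea of "semantics as a distribution" more concrete, the following is a minimal sketch of one way such a probabilistic formulation could be written. All symbols here ($x_{\mathrm{ori}}$, $y_{\mathrm{tar}}$, $p_{\mathrm{vic}}$, $p_{\mathrm{dis}}$, $f$, $c$, $\sigma$) are illustrative assumptions and are not taken from the text above; the block should be read as a plausible instance of the approach, not as the paper's exact definition.

```latex
% Assumed notation (hypothetical, for illustration only):
%   x_ori : the original image
%   y_tar : the label the attacker wants the classifier to output
%   f     : a classification loss of the victim model
%   c     : a positive weight trading off attack strength against similarity
\begin{align}
  % Adversarial examples are modeled as samples from a product of two factors.
  p_{\mathrm{adv}}(x \mid x_{\mathrm{ori}}, y_{\mathrm{tar}})
    &\propto p_{\mathrm{vic}}(x \mid y_{\mathrm{tar}})\,
             p_{\mathrm{dis}}(x \mid x_{\mathrm{ori}}), \\
  % "Victim" factor: high density where the classifier predicts y_tar.
  p_{\mathrm{vic}}(x \mid y_{\mathrm{tar}})
    &\propto \exp\bigl(-c\, f(x, y_{\mathrm{tar}})\bigr), \\
  % "Distance" factor, geometric special case: an isotropic Gaussian around
  % x_ori, which penalizes pixel-level deviation only.
  p_{\mathrm{dis}}(x \mid x_{\mathrm{ori}})
    &\propto \exp\Bigl(-\tfrac{1}{2\sigma^{2}}
             \lVert x - x_{\mathrm{ori}} \rVert_2^{2}\Bigr).
\end{align}
```

Under these assumptions, the Gaussian choice of $p_{\mathrm{dis}}$ corresponds to a conventional pixel-distance notion of similarity. A semantics-aware variant would instead replace $p_{\mathrm{dis}}$ with a distribution concentrated on images that a human would judge to share the meaning of $x_{\mathrm{ori}}$, which is how large pixel-level changes can coexist with preserved semantics.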