Attack-SAM: Towards Attacking Segment Anything Model With Adversarial Examples
This work addresses security concerns for vision foundation models, particularly in security-sensitive applications. Its contribution is incremental, since it applies known adversarial attack methods to a new model.
The paper studies the vulnerability of the Segment Anything Model (SAM) to adversarial examples, finding that SAM can be fooled into removing masks or generating any desired mask in both white-box and black-box settings.
The Segment Anything Model (SAM) has recently attracted significant attention due to its impressive performance on various downstream tasks in a zero-shot manner. The computer vision (CV) field may follow the natural language processing (NLP) field in moving from task-specific vision models toward foundation models. However, deep vision models are widely recognized as vulnerable to adversarial examples, which fool the model into making wrong predictions through imperceptible perturbations. This vulnerability raises serious concerns when deep models are applied in security-sensitive applications. It is therefore critical to know whether the vision foundation model SAM can also be fooled by adversarial attacks. To the best of our knowledge, this work is the first to comprehensively investigate how to attack SAM with adversarial examples. With the basic attack goal set to mask removal, we investigate the adversarial robustness of SAM in the full white-box setting and in transfer-based black-box settings. Beyond the basic goal of mask removal, we further find that it is possible to generate any desired mask through adversarial attacks.
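To make the mask-removal goal concrete, the following is a minimal sketch of a white-box, projected-gradient-style attack on a toy stand-in model. It is not SAM and not the paper's exact procedure: the per-pixel linear "segmenter" `toy_logits`, the hinge loss, and all parameter values (`eps`, `alpha`, `steps`, `margin`) are assumptions chosen for illustration. A pixel belongs to the predicted mask when its logit is positive, and the attack searches for a perturbation bounded in the L-infinity norm that pushes every logit below zero.

```python
import numpy as np


def toy_logits(x, w=4.0, b=-2.0):
    """Hypothetical per-pixel 'segmenter': a pixel is in the mask when logit > 0."""
    return w * x + b


def pgd_mask_removal(x, eps=8 / 255, alpha=2 / 255, steps=10, margin=0.1,
                     w=4.0, b=-2.0):
    """Sketch of a mask-removal attack under an L-infinity budget `eps`.

    Minimizes the hinge loss sum(max(logit + margin, 0)) by signed gradient
    steps, so only pixels whose logit is still above -margin are pushed down.
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        logits = toy_logits(x + delta, w, b)
        # Analytic gradient of the hinge loss w.r.t. the input for this
        # linear toy model: w where the hinge is active, 0 elsewhere.
        grad = w * (logits + margin > 0)
        # Signed descent step, then project back onto the eps-ball.
        delta = np.clip(delta - alpha * np.sign(grad), -eps, eps)
        # Keep the perturbed image inside the valid pixel range [0, 1].
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return x + delta


# Toy image: a faint 4x4 square (the "object") on a darker background.
x = np.full((8, 8), 0.3)
x[2:6, 2:6] = 0.52

clean_mask = toy_logits(x) > 0
x_adv = pgd_mask_removal(x)
adv_mask = toy_logits(x_adv) > 0

print("mask pixels before attack:", int(clean_mask.sum()))
print("mask pixels after attack: ", int(adv_mask.sum()))
print("max perturbation:", float(np.abs(x_adv - x).max()))
```

For a real model such as SAM, the analytic gradient would be replaced by one obtained through automatic differentiation with respect to the input image. The targeted variant mentioned above (generating a desired mask) would presumably change only the loss, pushing logits up inside the target region and down outside it.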