CR CV Dec 20, 2024

Technical Report for ICML 2024 TiFA Workshop MLLM Attack Challenge: Suffix Injection and Projected Gradient Descent Can Easily Fool An MLLM

Yangyang Guo, Ziwei Xu, Xilie Xu, YongKang Wong, Liqiang Nie, Mohan Kankanhalli

arXiv:2412.15614v1 2.3 h-index: 9

Originality: Synthesis-oriented

AI Analysis

This is an incremental improvement for security researchers working on adversarial attacks against multimodal AI systems.

The authors tackled the challenge of attacking multimodal large language models by combining suffix injection with projected gradient descent perturbations, achieving top-ranked performance in the TiFA workshop MLLM attack challenge by successfully fooling the LLaVA 1.5 model.

This technical report introduces our top-ranked solution, which employs two approaches, i.e., suffix injection and projected gradient descent (PGD), to address the TiFA workshop MLLM attack challenge. Specifically, we first append the text from an incorrectly labeled option (pseudo-labeled) to the original query as a suffix. Using this modified query, our second approach applies the PGD method to add imperceptible perturbations to the image. Combining these two techniques enables successful attacks on the LLaVA 1.5 model.
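The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a toy linear classifier stands in for LLaVA 1.5, and the helper names (`inject_suffix`, `pgd_attack`), the rule for picking the wrong option, the targeted cross-entropy loss, and the values of `eps`, `alpha`, and `steps` are all assumptions made for illustration.

```python
import math

def inject_suffix(question, options, correct_key):
    """Append the text of an incorrect option (the pseudo-label) to the query."""
    # Assumption: take the first option key that is not the correct one.
    pseudo_key = next(k for k in sorted(options) if k != correct_key)
    return f"{question} {options[pseudo_key]}", pseudo_key

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Toy stand-in model (hypothetical): class probabilities of a linear classifier."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def target_loss_grad(weights, x, target):
    """Gradient w.r.t. x of the cross-entropy toward class `target`."""
    probs = predict(weights, x)
    # d(CE)/d(logit_k) = p_k - 1[k == target]
    dlogits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    return [sum(dlogits[k] * weights[k][j] for k in range(len(weights)))
            for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_attack(weights, image, target, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD: signed gradient steps, each followed by a projection."""
    adv = list(image)
    for _ in range(steps):
        grad = target_loss_grad(weights, adv, target)
        # Descend the targeted loss so the pseudo-label becomes more likely.
        adv = [a - alpha * sign(g) for a, g in zip(adv, grad)]
        # Project back into the eps-ball around the clean image ...
        adv = [min(max(a, x - eps), x + eps) for a, x in zip(adv, image)]
        # ... and into the valid pixel range [0, 1].
        adv = [min(max(a, 0.0), 1.0) for a in adv]
    return adv

# Usage with made-up data: two options, a two-"pixel" image, a 2x2 weight matrix.
options = {"A": "a cat", "B": "a dog"}
query, pseudo = inject_suffix("What animal is shown?", options, "A")
toy_weights = [[1.0, -1.0], [-1.0, 1.0]]
clean = [0.6, 0.4]
adversarial = pgd_attack(toy_weights, clean, target=1)
```

In a real attack the gradient would come from backpropagating through the multimodal model with the suffixed query as text input; the projection step is what keeps the perturbation within the `eps` bound so that it stays visually imperceptible.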
