CV CR LG Feb 19, 2024

AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization

Jiyao Li, Mingze Ni, Yifei Dong, Tianqing Zhu, Wei Liu

arXiv:2402.11940v48.75 citations h-index: 4 Has Code Mach Intell Res

Originality: Incremental advance

AI Analysis

This paper addresses the robustness of image captioning models, a task at the intersection of CV and NLP, but the advance is incremental because it builds on existing adversarial attack strategies.

The paper tackles the problem of adversarial attacks on image captioning models by proposing AICAttack, a black-box method using attention-based optimization, which achieves higher attack success rates than existing techniques in experiments on benchmark datasets.

Recent advances in deep learning research have shown remarkable achievements across many tasks in computer vision (CV) and natural language processing (NLP). At the intersection of CV and NLP is the problem of image captioning, where the related models' robustness against adversarial attacks has not been well studied. This paper presents a novel adversarial attack strategy, AICAttack (Attention-based Image Captioning Attack), designed to attack image captioning models through subtle perturbations on images. Operating within a black-box attack scenario, our algorithm requires no access to the target model's architecture, parameters, or gradient information. We introduce an attention-based candidate selection mechanism that identifies the optimal pixels to attack, followed by a customised differential evolution method to optimise the perturbations of pixels' RGB values. We demonstrate AICAttack's effectiveness through extensive experiments on benchmark datasets against multiple victim models. The experimental results demonstrate that our method outperforms current leading-edge techniques by achieving consistently higher attack success rates.
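The two stages described in the abstract (attention-guided pixel selection, then differential evolution over the selected pixels' RGB values) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a textbook DE/rand/1/bin loop rather than the paper's customised variant, it assumes an attention map is already available as a 2-D array, and `score_fn` is a hypothetical stand-in for whatever black-box caption-quality score an attacker would query. All function names and parameter values here are illustrative assumptions.

```python
import numpy as np

def select_candidate_pixels(attention, k):
    """Return (row, col) coordinates of the k highest-attention pixels."""
    flat = np.argsort(attention, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat, attention.shape)
    return np.stack([rows, cols], axis=1)

def apply_perturbation(image, coords, rgb):
    """Copy the image and overwrite the chosen pixels with new RGB values."""
    adv = image.copy()
    values = np.clip(np.rint(rgb), 0, 255).astype(image.dtype)
    adv[coords[:, 0], coords[:, 1]] = values
    return adv

def differential_evolution_attack(image, coords, score_fn, pop_size=20,
                                  generations=30, f=0.5, cr=0.7, seed=0):
    """Generic DE/rand/1/bin over the RGB values of the selected pixels.

    score_fn(adv_image) -> float is a hypothetical black-box score;
    lower means the attack is doing better. Requires pop_size >= 4.
    """
    rng = np.random.default_rng(seed)
    k = len(coords)
    pop = rng.uniform(0, 255, size=(pop_size, k, 3))
    # Seed one individual with the clean pixels so the result is never
    # worse than the unperturbed image under score_fn.
    pop[0] = image[coords[:, 0], coords[:, 1]]
    scores = np.array([score_fn(apply_perturbation(image, coords, p))
                       for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + f * (b - c), 0, 255)
            # Binomial crossover; force at least one gene from the mutant.
            mask = rng.random((k, 3)) < cr
            mask[rng.integers(k), rng.integers(3)] = True
            trial = np.where(mask, mutant, pop[i])
            s = score_fn(apply_perturbation(image, coords, trial))
            if s <= scores[i]:  # greedy selection: keep the better one
                pop[i], scores[i] = trial, s
    best = int(np.argmin(scores))
    return apply_perturbation(image, coords, pop[best]), float(scores[best])

# Toy usage with random data; image.mean() is a placeholder score,
# not a real captioning metric.
data_rng = np.random.default_rng(1)
image = data_rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
attention = data_rng.random((16, 16))
coords = select_candidate_pixels(attention, k=5)
toy_score = lambda img: float(img.mean())
adv, score = differential_evolution_attack(image, coords, toy_score)
```

In a real black-box setting, `score_fn` would query the victim captioning model and compare its output with the reference caption, which is the only access to the model the abstract assumes. Restricting the search to `k` attended pixels keeps the search space at `3k` values, which is what makes a gradient-free optimiser practical.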

