CV AI CL CR LG · Nov 5, 2024

Membership Inference Attacks against Large Vision-Language Models

Zhan Li, Yongtao Wu, Yihang Chen, Francesco Tonin, Elias Abad Rocamora, Volkan Cevher

arXiv:2411.02902v121.532 citations · h-index: 61 · Has Code · NIPS

Originality: Incremental advance

AI Analysis

This work addresses data security concerns for users of large vision-language models, particularly privacy risks from sensitive information in training datasets. It is an incremental advance: it adapts existing MIA techniques to a new multimodal context.

The authors tackle the problem of detecting sensitive training data in large vision-language models by introducing the first membership inference attack benchmark and a novel token-level image detection pipeline. Their results demonstrate that these models are vulnerable to data leakage.

Large vision-language models (VLLMs) exhibit promising capabilities for processing multi-modal tasks across various application scenarios. However, their emergence also raises significant data security concerns, given the potential inclusion of sensitive information, such as private photos and medical records, in their training datasets. Detecting inappropriately used data in VLLMs remains a critical and unresolved issue, mainly due to the lack of standardized datasets and suitable methodologies. In this study, we introduce the first membership inference attack (MIA) benchmark tailored for various VLLMs to facilitate training data detection. Then, we propose a novel MIA pipeline specifically designed for token-level image detection. Lastly, we present a new metric called MaxRényi-K%, which is based on the confidence of the model output and applies to both text and image data. We believe that our work can deepen the understanding and methodology of MIAs in the context of VLLMs. Our code and datasets are available at https://github.com/LIONS-EPFL/VL-MIA.
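The abstract describes MaxRényi-K% only as a metric based on the confidence of the model output. As a rough illustration, the sketch below assumes one plausible reading: compute the Rényi entropy of the model's next-token distribution at each position, keep the K% of positions with the largest entropy, and average them. The function names, the default `alpha` and `k_percent` values, and the input format (a list of per-position probability vectors) are assumptions made here for illustration and are not taken from the authors' code; the linked repository is the reference for the actual definition.

```python
import math
from typing import Sequence


def renyi_entropy(probs: Sequence[float], alpha: float = 0.5) -> float:
    """Rényi entropy of order alpha for one probability vector.

    alpha == 1 is handled as the Shannon-entropy limit and
    alpha == inf as the min-entropy, -log(max p).
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    support = [p for p in probs if p > 0.0]  # zero-probability tokens contribute nothing
    if not support:
        raise ValueError("probability vector has no positive entries")
    if math.isinf(alpha):
        return -math.log(max(support))
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in support)
    return math.log(sum(p ** alpha for p in support)) / (1.0 - alpha)


def max_renyi_k(
    token_dists: Sequence[Sequence[float]],
    k_percent: float = 10.0,
    alpha: float = 0.5,
) -> float:
    """Average of the largest k_percent per-position Rényi entropies.

    token_dists is a hypothetical input: one next-token probability
    vector per position (text tokens or image-token positions).
    """
    if not token_dists:
        raise ValueError("need at least one token distribution")
    if not 0.0 < k_percent <= 100.0:
        raise ValueError("k_percent must be in (0, 100]")
    entropies = sorted(
        (renyi_entropy(dist, alpha) for dist in token_dists), reverse=True
    )
    n_keep = max(1, math.ceil(len(entropies) * k_percent / 100.0))
    return sum(entropies[:n_keep]) / n_keep


if __name__ == "__main__":
    confident = [[0.97, 0.01, 0.01, 0.01]] * 4   # peaked distributions
    uncertain = [[0.25, 0.25, 0.25, 0.25]] * 4   # uniform distributions
    print(max_renyi_k(confident, k_percent=50))
    print(max_renyi_k(uncertain, k_percent=50))
```

Under this reading, a lower score means the model is more confident across its least-certain positions, which an attacker would treat as weak evidence that the sample was seen during training. In practice the score would be compared against a threshold calibrated on known member and non-member samples.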
