CL · Apr 17, 2024

Sampling-based Pseudo-Likelihood for Membership Inference Attacks

arXiv:2404.11262v1 · 25 citations · h-index: 19 · ACL
Originality: Incremental advance
AI Analysis

This work addresses privacy and security risks for users of proprietary LLMs such as ChatGPT or Claude 3, but it is an incremental advance: it adapts existing membership inference methods to work without likelihoods.

The study tackles the problem of detecting training-data leaks in large language models (LLMs) when likelihoods are unavailable. It proposes a sampling-based pseudo-likelihood method for membership inference attacks and reports performance comparable to existing likelihood-based methods.

Large Language Models (LLMs) are trained on large-scale web data, which makes it difficult to grasp the contribution of each text. This poses the risk of leaking inappropriate data such as benchmarks, personal information, and copyrighted texts in the training data. Membership Inference Attacks (MIAs), which determine whether a given text is included in the model's training data, have been attracting attention. Previous studies of MIAs revealed that likelihood-based classification is effective for detecting leaks in LLMs. However, the existing methods cannot be applied to some proprietary models like ChatGPT or Claude 3 because the likelihood is unavailable to the user. In this study, we propose a Sampling-based Pseudo-Likelihood (SPL) method for MIA (SaMIA) that calculates SPL using only the text generated by an LLM to detect leaks. SaMIA treats the target text as the reference text and multiple outputs from the LLM as text samples, calculates the degree of n-gram match as SPL, and determines the membership of the text in the training data. Even without likelihoods, SaMIA performed on par with existing likelihood-based methods.
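
The scoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes whitespace tokenization, a recall-style n-gram overlap measured against the target text, averaging over samples, and an arbitrary decision threshold of 0.5; the function names `ngram_recall`, `pseudo_likelihood`, and `is_member` are hypothetical. How the samples are obtained from the LLM is outside the sketch.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(reference: str, sample: str, n: int = 1) -> float:
    """Share of the reference's n-grams that also occur in the sample.

    Assumption: whitespace tokenization and clipped counts (an n-gram is
    matched at most as often as it occurs in the sample).
    """
    ref = ngrams(reference.split(), n)
    hyp = ngrams(sample.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum((ref & hyp).values())  # Counter & Counter keeps the minimum counts
    return overlap / total


def pseudo_likelihood(reference: str, samples: List[str], n: int = 1) -> float:
    """Average n-gram match between the target text and the sampled texts.

    Assumption: the mean over samples stands in for the pseudo-likelihood.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(ngram_recall(reference, s, n) for s in samples) / len(samples)


def is_member(reference: str, samples: List[str], n: int = 1,
              threshold: float = 0.5) -> bool:
    """Flag the target text as a likely training member.

    Assumption: the threshold is illustrative; in practice it would be tuned.
    """
    return pseudo_likelihood(reference, samples, n) >= threshold


if __name__ == "__main__":
    target = "the cat sat on the mat"
    # Hypothetical texts standing in for outputs sampled from an LLM.
    sampled = ["the cat sat on the mat", "a dog ran in the park"]
    print(pseudo_likelihood(target, sampled))
    print(is_member(target, sampled))
```

The sketch only needs generated text, not token probabilities, which is the property that makes this style of attack applicable to models that do not expose likelihoods.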

Code Implementations: 1 repo
