CL · AI · CR · LG · Mar 21, 2025

Language Models May Verbatim Complete Text They Were Not Explicitly Trained On

DeepMind
arXiv:2503.17514v2 · 24 citations · h-index: 36 · ICML
Originality: Incremental advance
AI Analysis

This paper reveals a critical flaw in current methods for verifying whether a text was part of an LLM's training data, a question that matters to researchers and practitioners concerned with data privacy and model transparency.

The paper demonstrates that n-gram based membership tests, which are meant to detect whether a text was used to train a large language model, can be gamed: models can complete sequences that count as non-members under such definitions. Natural cases of this phenomenon include exact duplicates, near-duplicates, and short overlaps.

An important question today is whether a given text was used to train a large language model (LLM). A \emph{completion} test is often employed: check if the LLM completes a sufficiently complex text. This, however, requires a ground-truth definition of membership; most commonly, it is defined as a member based on the $n$-gram overlap between the target text and any text in the dataset. In this work, we demonstrate that this $n$-gram based membership definition can be effectively gamed. We study scenarios where sequences are \emph{non-members} for a given $n$ and we find that completion tests still succeed. We find many natural cases of this phenomenon by retraining LLMs from scratch after removing all training samples that were completed; these cases include exact duplicates, near-duplicates, and even short overlaps. They showcase that it is difficult to find a single viable choice of $n$ for membership definitions. Using these insights, we design adversarial datasets that can cause a given target sequence to be completed without containing it, for any reasonable choice of $n$. Our findings highlight the inadequacy of $n$-gram membership, suggesting membership definitions fail to account for auxiliary information available to the training algorithm.
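To make the $n$-gram membership definition above concrete, here is a minimal sketch in Python. It is not the authors' implementation: the whitespace tokenization, the function names `ngrams` and `is_ngram_member`, and the toy corpus are all assumptions chosen for illustration. It only shows how the same target text can flip from member to non-member as $n$ changes.

```python
def ngrams(tokens, n):
    """Return the set of contiguous n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_ngram_member(target, corpus, n):
    """True if any n-gram of `target` also appears in some corpus document.

    Assumption: tokens are whitespace-separated words; a real pipeline
    would use the model's tokenizer instead.
    """
    target_grams = ngrams(target.split(), n)
    return any(target_grams & ngrams(doc.split(), n) for doc in corpus)


# Hypothetical toy corpus and target, for illustration only.
corpus = [
    "the quick brown fox jumps over the lazy dog",
    "a quick brown cat sleeps under the warm sun",
]
target = "the quick brown cat jumps over a lazy dog"

for n in (2, 3, 4):
    print(n, is_ngram_member(target, corpus, n))
```

In this toy setup the target shares 2-grams and 3-grams with the corpus but no 4-gram, so it is a member for $n = 2, 3$ and a non-member for $n = 4$, even though its pieces are spread across the two documents. This is only meant to illustrate why the choice of $n$ matters; it says nothing about whether a model trained on such a corpus would complete the target.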

