LG CL · Apr 23, 2024

Rethinking LLM Memorization through the Lens of Adversarial Compression

Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

arXiv:2404.15146v335.1108 citations · h-index: 58 · NIPS

Originality: Incremental advance

AI Analysis

This work addresses concerns about permissible data usage and memorization in LLMs, which matter both to model owners and for legal compliance. Its advance is incremental, since it builds on existing notions of memorization.

The authors address the problem of defining and measuring memorization in large language models by proposing the Adversarial Compression Ratio (ACR) as a metric. Under the ACR, a training-data string counts as memorized if it can be elicited by a prompt shorter than the string itself. The authors present the metric as a practical tool for legal compliance and data usage monitoring.

Large language models (LLMs) trained on web-scale datasets raise substantial concerns regarding permissible data usage. One major question is whether these models "memorize" all their training data or whether they integrate many data sources in a way more akin to how a human would learn and synthesize information. The answer hinges, to a large degree, on how we define memorization. In this work, we propose the Adversarial Compression Ratio (ACR) as a metric for assessing memorization in LLMs. A given string from the training data is considered memorized if it can be elicited by a prompt (much) shorter than the string itself; in other words, if the string can be "compressed" with the model by computing an adversarial prompt of fewer tokens. The ACR overcomes the limitations of existing notions of memorization by (i) offering an adversarial view of measuring memorization, especially for monitoring unlearning and compliance; and (ii) allowing for the flexibility to measure memorization for arbitrary strings with reasonably low compute. Our definition serves as a practical tool for determining when model owners may be violating terms around data usage, providing a potential legal tool and a critical lens through which to address such scenarios.
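The criterion in the abstract above can be illustrated with a small sketch. It assumes the ratio is computed as the target length in tokens divided by the length of the shortest eliciting prompt, so that a value above 1 means the prompt is shorter than the string. Everything named here is hypothetical: `toy_model` is a lookup table standing in for an LLM, and the brute-force search stands in for whatever optimizer is used to compute adversarial prompts; neither is taken from the paper.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple

Tokens = Tuple[str, ...]


def shortest_eliciting_prompt(
    model: Callable[[Tokens], Tokens],
    target: Tokens,
    vocab: Sequence[str],
    max_len: int,
) -> Optional[Tokens]:
    """Brute-force stand-in for an adversarial prompt search (assumption).

    Tries prompts in order of increasing length, so the first match is
    a shortest prompt that makes the model emit `target` exactly.
    """
    for length in range(1, max_len + 1):
        for prompt in product(vocab, repeat=length):
            if model(prompt) == target:
                return prompt
    return None


def compression_ratio(target: Tokens, prompt: Tokens) -> float:
    """Target length over prompt length, both counted in tokens."""
    return len(target) / len(prompt)


def is_memorized(target: Tokens, prompt: Optional[Tokens]) -> bool:
    """Memorized if some eliciting prompt is shorter than the target."""
    return prompt is not None and compression_ratio(target, prompt) > 1.0


def toy_model(prompt: Tokens) -> Tokens:
    """Hypothetical deterministic 'model': a fixed prompt-to-output table."""
    table = {
        ("a",): ("the", "quick", "brown", "fox"),
        ("a", "b", "c"): ("hi",),
    }
    return table.get(tuple(prompt), ())


vocab = ["a", "b", "c"]

long_target = ("the", "quick", "brown", "fox")
long_prompt = shortest_eliciting_prompt(toy_model, long_target, vocab, max_len=3)

short_target = ("hi",)
short_prompt = shortest_eliciting_prompt(toy_model, short_target, vocab, max_len=3)
```

In this toy setting the four-token string is elicited by a one-token prompt, so it counts as memorized, while the one-token string needs a three-token prompt and does not. A real measurement would replace the lookup table with greedy decoding from an actual model and the exhaustive search with a discrete prompt optimizer, since enumerating prompts is infeasible at realistic vocabulary sizes.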
