Mar 10, 2025

Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs

arXiv:2503.07384v2 | 8 citations | h-index: 42
Originality: Synthesis-oriented
AI Analysis

The paper addresses data privacy concerns in AI by providing a tool for auditing models for transparency and ethical compliance. Its contribution is incremental, since it applies an existing method to a new domain.

This work adapted a gradient-based membership inference test to determine if text data was used in training large language models, achieving AUC scores between 85% and 99% across various models and datasets.

This work adapts the gradient-based Membership Inference Test (gMINT) to LLM-based text classification and studies its behavior there. MINT is a general approach for determining whether given data was used to train a machine learning model, and this work focuses on its application to Natural Language Processing. Using gradient-based analysis, the MINT model identifies whether particular data samples were included in the language model's training phase, addressing growing concerns about data privacy in machine learning. The method was evaluated on seven Transformer-based models and six datasets comprising over 2.5 million sentences, focusing on text classification tasks. Experimental results demonstrate MINT's robustness, with AUC scores between 85% and 99% depending on data size and model architecture. These findings highlight MINT's potential as a scalable and reliable tool for auditing machine learning models, ensuring transparency, safeguarding sensitive data, and fostering ethical compliance in the deployment of AI/NLP technologies.
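To make the idea of a gradient-based membership test concrete, here is a minimal toy sketch in pure Python. It is not the paper's implementation, and it makes several simplifying assumptions: the audited model is a small linear regressor rather than a Transformer, the data is synthetic, and the membership score is simply the negated per-sample gradient norm rather than the output of a trained MINT model. All function names (`train`, `grad_norm`, `auc`, `run_demo`) are hypothetical. The sketch only illustrates the underlying intuition that samples seen during training tend to produce smaller loss gradients, and how AUC summarizes the separation between members and non-members.

```python
import math
import random


def predict(w, x):
    # Linear model output: dot product of weights and features.
    return sum(wi * xi for wi, xi in zip(w, x))


def grad_norm(w, x, y):
    # For the loss 0.5 * (w.x - y)^2, the per-sample gradient w.r.t. w
    # is (w.x - y) * x, so its norm is |residual| * ||x||.
    residual = predict(w, x) - y
    return abs(residual) * math.sqrt(sum(xi * xi for xi in x))


def train(samples, dim, epochs=300, lr=0.01):
    # Plain per-sample SGD on the squared loss (toy stand-in for model training).
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in samples:
            residual = predict(w, x) - y
            w = [wi - lr * residual * xi for wi, xi in zip(w, x)]
    return w


def auc(member_scores, non_member_scores):
    # Probability that a random member scores higher than a random
    # non-member; ties count as half a win.
    wins = 0.0
    for m in member_scores:
        for n in non_member_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(non_member_scores))


def run_demo(seed=0, dim=50, n=20):
    # Synthetic data (assumption): random features with random targets,
    # so the model can only fit the training samples by memorizing them.
    rng = random.Random(seed)

    def sample():
        return [rng.gauss(0, 1) for _ in range(dim)], rng.gauss(0, 1)

    members = [sample() for _ in range(n)]      # used for training
    non_members = [sample() for _ in range(n)]  # never seen by the model
    w = train(members, dim)

    # Higher score = more likely a member, so negate the gradient norm.
    member_scores = [-grad_norm(w, x, y) for x, y in members]
    non_member_scores = [-grad_norm(w, x, y) for x, y in non_members]
    return auc(member_scores, non_member_scores)


DEMO_AUC = run_demo()
print(f"toy membership AUC: {DEMO_AUC:.3f}")
```

Because this toy model has more parameters than training samples, it memorizes its training set and the separation is far cleaner than one would expect from a real LLM. The 85% to 99% AUC range quoted above comes from the paper's experiments, not from this sketch.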
