CRAI · Jun 12, 2025

SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks

arXiv:2506.10424v1 · 10 citations · h-index: 21 · Has Code · USENIX Security Symposium
Originality: Highly original
AI Analysis

This work addresses privacy concerns for users and organizations that fine-tune LLMs on sensitive data, offering a practical defense against membership inference attacks.

The paper tackles the vulnerability of fine-tuned large language models to membership inference attacks (MIAs). It shows that these attacks exploit the loss reduction that occurs during fine-tuning, and it proposes SOFT, a defense that reduces privacy risk while maintaining competitive performance across six domains and multiple model scales.

Large language models (LLMs) have achieved remarkable success and are widely adopted for diverse applications. However, fine-tuning these models often involves private or sensitive information, raising critical privacy concerns. In this work, we conduct the first comprehensive study evaluating the vulnerability of fine-tuned LLMs to membership inference attacks (MIAs). Our empirical analysis demonstrates that MIAs exploit the loss reduction during fine-tuning, making them highly effective in revealing membership information. These findings motivate the development of our defense. We propose SOFT (Selective data Obfuscation in LLM Fine-Tuning), a novel defense technique that mitigates privacy leakage by leveraging influential data selection with an adjustable parameter to balance utility preservation and privacy protection. Our extensive experiments span six diverse domains and multiple LLM architectures and scales. Results show that SOFT effectively reduces privacy risks while maintaining competitive model performance, offering a practical and scalable solution to safeguard sensitive information in fine-tuned LLMs.
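To make the "loss reduction" signal in the abstract concrete, here is a minimal sketch of a generic loss-based membership score. It is not the paper's attack or the SOFT defense. It assumes the attacker can obtain a per-sample loss from both the base model and the fine-tuned model. The function names, the threshold, and the toy loss values are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def loss_reduction_scores(base_losses: Sequence[float],
                          finetuned_losses: Sequence[float]) -> List[float]:
    """Per-sample drop in loss from the base model to the fine-tuned model.

    A larger drop suggests the sample was seen during fine-tuning.
    """
    if len(base_losses) != len(finetuned_losses):
        raise ValueError("loss sequences must have the same length")
    return [b - f for b, f in zip(base_losses, finetuned_losses)]


def predict_members(scores: Sequence[float], threshold: float) -> List[bool]:
    """Flag a sample as a suspected member when its loss drop exceeds the threshold."""
    return [s > threshold for s in scores]


def attack_auc(member_scores: Sequence[float],
               nonmember_scores: Sequence[float]) -> float:
    """Probability that a random member scores higher than a random non-member.

    Ties count as half a win; 0.5 means the attack is no better than chance.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Toy, made-up losses: fine-tuning lowers the loss on members far more
# than on held-out samples.
member_scores = loss_reduction_scores([2.5, 2.5, 3.0], [1.0, 1.5, 1.0])
nonmember_scores = loss_reduction_scores([2.5, 3.0, 2.0], [2.25, 2.5, 2.0])

print(predict_members(member_scores, threshold=0.75))
print(predict_members(nonmember_scores, threshold=0.75))
print(attack_auc(member_scores, nonmember_scores))
```

In this framing, a defense lowers the attack's AUC toward 0.5 by shrinking the gap between the loss drop on training samples and the loss drop on unseen samples.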

Code Implementations: 1 repo
