CRAI Feb 23, 2025

RewardDS: Privacy-Preserving Fine-Tuning for Large Language Models via Reward Driven Data Synthesis

arXiv:2502.18517v2, 4 citations, h-index: 8, EMNLP
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns in sensitive domains such as healthcare and finance by enabling fine-tuning without exposing private data, though it is an incremental improvement over existing synthetic data methods.

The paper tackles privacy-preserving fine-tuning of large language models, where private data is replaced by synthetic data generated with differential privacy guarantees. Because existing methods produce noisy synthetic data, the authors propose RewardDS, which uses a reward proxy model to filter and refine that data, and report improvements in the medical, financial, and code generation domains.

The success of large language models (LLMs) has attracted many individuals to fine-tune them for domain-specific tasks by uploading their data. However, in sensitive areas like healthcare and finance, privacy concerns often arise. One promising solution is to generate synthetic data with Differential Privacy (DP) guarantees to replace private data. Such synthetic data, though, contains a significant amount of flawed samples, which are regarded as noise. Existing solutions typically rely on naive filtering by comparing ROUGE-L scores or embedding similarities, which is ineffective in addressing the noise. To address this issue, we propose RewardDS, a novel privacy-preserving framework that fine-tunes a reward proxy model and uses reward signals to guide the synthetic data generation. RewardDS introduces two key modules, Reward Guided Filtering and Self-Optimizing Refinement, to both filter and refine the synthetic data, effectively mitigating the noise. Extensive experiments across medical, financial, and code generation domains demonstrate the effectiveness of our method.
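The abstract names the two modules but gives no algorithmic detail, so the following is only a minimal sketch of the general filter-then-refine idea, not the authors' implementation. It assumes a reward function that scores a (query, response) pair and a proposal function that rewrites a response; `toy_reward`, `toy_propose`, `keep_ratio`, and `n_candidates` are all hypothetical stand-ins. In the paper's setting the reward would come from the fine-tuned reward proxy model and the proposals from an LLM.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str]  # (query, response)


def reward_guided_filter(
    samples: List[Sample],
    reward_fn: Callable[[Sample], float],
    keep_ratio: float = 0.5,
) -> Tuple[List[Sample], List[Sample]]:
    """Rank samples by reward and split them into (kept, rejected)."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    ranked = sorted(samples, key=reward_fn, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_ratio)) if ranked else 0
    return ranked[:n_keep], ranked[n_keep:]


def refine(
    rejected: List[Sample],
    reward_fn: Callable[[Sample], float],
    propose_fn: Callable[[str, str], str],
    n_candidates: int = 3,
) -> List[Sample]:
    """For each rejected sample, keep the highest-reward candidate rewrite.

    The original sample is always among the candidates, so the reward
    of the returned sample is never lower than the original's.
    """
    refined = []
    for query, response in rejected:
        candidates = [(query, response)]
        for _ in range(n_candidates):
            candidates.append((query, propose_fn(query, response)))
        refined.append(max(candidates, key=reward_fn))
    return refined


# Hypothetical placeholders, for illustration only.
def toy_reward(sample: Sample) -> float:
    """Count the words shared by query and response."""
    query, response = sample
    return float(len(set(query.split()) & set(response.split())))


def toy_propose(query: str, response: str) -> str:
    """Rewrite a response by appending the query text."""
    return (response + " " + query).strip()


synthetic = [
    ("what is dp", "dp is differential privacy"),
    ("what is dp", "banana"),
    ("define rouge", "rouge is a metric"),
    ("define rouge", ""),
]
kept, rejected = reward_guided_filter(synthetic, toy_reward, keep_ratio=0.5)
cleaned = kept + refine(rejected, toy_reward, toy_propose)
```

Unlike ROUGE-L or embedding-similarity filtering, which the abstract describes as naive, this design ranks samples by a learned quality signal and tries to repair low-scoring samples instead of only discarding them.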

