LG CR | Jan 8, 2025

Navigating the Designs of Privacy-Preserving Fine-tuning for Large Language Models

arXiv:2501.04323v3 | 1 citation | h-index: 2 | WWW

Originality: Incremental advance

AI Analysis

This work addresses privacy and cost trade-offs in real-world deployments of large language models. It is an incremental advance, building on existing architectures such as split learning and offsite tuning.

The paper tackles the conflict between model providers' intellectual property, clients' data privacy, and tuning costs in fine-tuning large language models by proposing GuardedTuning, a series of designs that protect against data reconstruction attacks while maintaining competitive fine-tuning performance.

Instruction tuning has proven effective in enhancing Large Language Models' (LLMs) performance on downstream tasks. However, real-world fine-tuning faces inherent conflicts between model providers' intellectual property protection, clients' data privacy requirements, and tuning costs. While recent approaches like split learning and offsite tuning demonstrate promising architectures for privacy-preserving fine-tuning, there is a gap in systematically addressing the multidimensional trade-offs required for diverse real-world deployments. We propose several indicative evaluation metrics to guide design trade-offs for privacy-preserving fine-tuning and a series of example designs, collectively named GuardedTuning; they result from novel combinations of system architectures with adapted privacy-enhancement methods and emerging computation techniques. Each design represents distinct trade-offs across model utility, privacy guarantees, and costs. Experimental results demonstrate that these designs protect against data reconstruction attacks while maintaining competitive fine-tuning performance.
