LG, CR
Sep 21, 2023

Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation

Microsoft
arXiv:2309.11765v2
112 citations
h-index: 25
Originality: Incremental advance
AI Analysis

The paper addresses privacy risks for users of large language models in applications that require sensitive data. The contribution is incremental because it builds on existing differential privacy and in-context learning methods.

The paper tackles in-context learning with large language models on private datasets, where the prompt risks leaking sensitive information. It proposes a differentially private algorithm that generates synthetic few-shot demonstrations and achieves competitive performance with strong privacy guarantees.

We study the problem of in-context learning (ICL) with large language models (LLMs) on private datasets. This scenario poses privacy risks, as LLMs may leak or regurgitate the private examples demonstrated in the prompt. We propose a novel algorithm that generates synthetic few-shot demonstrations from the private dataset with formal differential privacy (DP) guarantees, and show empirically that it can achieve effective ICL. We conduct extensive experiments on standard benchmarks and compare our algorithm with non-private ICL and zero-shot solutions. Our results demonstrate that our algorithm can achieve competitive performance with strong privacy levels. These results open up new possibilities for ICL with privacy protection for a broad range of applications.
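To make the idea of generating text under a DP constraint more concrete, here is a minimal sketch of one generic building block: choosing a single next token by averaging next-token distributions computed from disjoint subsets of the private data and adding Gaussian noise before taking the argmax. This is an illustration only and is not taken from the abstract, which does not describe the mechanism. The function name `dp_next_token`, the replace-one adjacency assumption, and the noise scale `sigma` are all assumptions, and the sketch performs no privacy accounting.

```python
import numpy as np


def dp_next_token(prob_vectors, sigma, rng):
    """Pick one token id from per-subset next-token distributions (hypothetical helper).

    prob_vectors: shape (num_subsets, vocab_size); each row is a probability
    distribution computed from one disjoint subset of the private data.
    sigma: standard deviation of the Gaussian noise (assumed to be chosen
    elsewhere to match a target privacy budget; not computed here).
    rng: a numpy random Generator.
    """
    probs = np.asarray(prob_vectors, dtype=float)

    # Average the distributions across subsets. Under replace-one adjacency,
    # changing one row moves this mean by at most sqrt(2) / num_subsets in L2 norm.
    mean = probs.mean(axis=0)

    # Add independent Gaussian noise to every vocabulary entry.
    noisy = mean + rng.normal(0.0, sigma, size=mean.shape)

    # Release only the index of the largest noisy score.
    return int(np.argmax(noisy))


rng = np.random.default_rng(0)
example_probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
]
token_id = dp_next_token(example_probs, sigma=0.05, rng=rng)
```

Repeating such a step token by token would yield a synthetic demonstration; a full procedure would also need to track the cumulative privacy cost across all released tokens.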

Code Implementations: 1 repo