Public Data Assisted Differentially Private In-Context Learning
This work addresses the risk of private data leakage in ICL for users of large language models. It is an incremental improvement that integrates public data into existing DP frameworks.
The paper tackles private data leakage in in-context learning (ICL) for large language models. It incorporates task-related public data to improve utility while maintaining differential privacy guarantees, and it reports significant utility improvements and robustness against membership inference attacks.
In-context learning (ICL) in Large Language Models (LLMs) has shown remarkable performance across various tasks without requiring fine-tuning. However, recent studies have highlighted the risk of private data leakage through the prompt in ICL, especially when LLMs are exposed to malicious attacks. While differential privacy (DP) provides strong privacy guarantees, it often significantly reduces the utility of ICL. To address this challenge, we incorporate task-related public data into the ICL framework while maintaining the DP guarantee. Based on this approach, we propose a private in-context learning algorithm that effectively balances privacy protection and model utility. Through experiments, we demonstrate that our approach significantly improves the utility of private ICL with the assistance of public data. Additionally, we show that our method is robust against membership inference attacks, demonstrating empirical privacy protection.
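The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common pattern for private ICL with a classification task, not the authors' method. It assumes the private demonstrations are split into disjoint subsets, each subset is combined with the same public demonstrations to form a prompt, and the per-prompt predictions are aggregated by a Laplace-noised vote. The names `noisy_vote`, `private_icl_predict`, and `toy_model` are hypothetical, and `toy_model` is a stand-in for an LLM call.

```python
import numpy as np


def noisy_vote(votes, num_classes, epsilon, rng):
    """Return the argmax of a Laplace-noised vote histogram.

    Assumption: each private example influences exactly one vote, so
    replacing one example changes the histogram by at most 2 in L1 norm.
    Laplace noise with scale 2/epsilon then gives epsilon-DP, and taking
    the argmax is post-processing.
    """
    counts = np.bincount(votes, minlength=num_classes).astype(float)
    noise = rng.laplace(loc=0.0, scale=2.0 / epsilon, size=num_classes)
    return int(np.argmax(counts + noise))


def private_icl_predict(private_examples, public_examples, query,
                        model, num_classes, num_subsets, epsilon, rng):
    """Predict a label for `query` from disjoint private subsets.

    Public examples are shared by every prompt; they carry no privacy
    cost in this sketch because they are assumed to be non-sensitive.
    """
    # Disjoint partition: example i goes to subset i mod num_subsets.
    subsets = [private_examples[i::num_subsets] for i in range(num_subsets)]
    # One model call (one vote) per subset.
    votes = np.array([model(public_examples + subset, query)
                      for subset in subsets])
    return noisy_vote(votes, num_classes, epsilon, rng)


def toy_model(demonstrations, query):
    """Hypothetical stand-in for an LLM: majority label of the demos."""
    labels = [label for _, label in demonstrations]
    return int(np.bincount(labels).argmax())
```

A call such as `private_icl_predict(private, public, "some query", toy_model, num_classes=2, num_subsets=4, epsilon=1.0, rng=np.random.default_rng(0))` returns a single class index. Smaller values of `epsilon` add more noise to the vote, which is the utility cost the abstract refers to.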