CL · AI · Jul 7, 2025

Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences

arXiv:2507.05391v2 · 1 citation · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses privacy concerns for users of commercial LLM APIs. The advance is incremental because it builds on existing methods for data filtering.

The paper tackles the problem of users exposing sensitive data when using LLM APIs. It introduces privacy profiles, natural language instructions that control what information is shared, and shows that fine-tuned lightweight local models preserve privacy better while matching or exceeding the performance of much larger zero-shot models.

Large language models (LLMs) are primarily accessed via commercial APIs, but this often requires users to expose their data to service providers. In this paper, we explore how users can stay in control of their data by using privacy profiles: simple natural language instructions that say what should and should not be revealed. We build a framework where a local model uses these instructions to rewrite queries, only hiding details deemed sensitive by the user, before sending them to an external model, thus balancing privacy with performance. To support this research, we introduce PEEP, a multilingual dataset of real user queries annotated to mark private content and paired with synthetic privacy profiles. Experiments with lightweight local LLMs show that, after fine-tuning, they not only achieve markedly better privacy preservation but also match or exceed the performance of much larger zero-shot models. At the same time, the system still faces challenges in fully adhering to user instructions, underscoring the need for models with a better understanding of user-defined privacy preferences.
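
The framework described in the abstract (a local model rewrites the query according to a privacy profile, and only the rewritten query reaches the external model) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the names `PrivacyProfile`, `build_rewrite_prompt`, and `answer_privately`, the prompt wording, and both toy models are assumptions made for the example. In particular, the toy local model is a hard-coded string replacement standing in for a fine-tuned local LLM.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PrivacyProfile:
    # Natural-language instruction saying what may and may not be revealed.
    # The wording used below is hypothetical, not taken from the PEEP dataset.
    instruction: str


def build_rewrite_prompt(profile: PrivacyProfile, query: str) -> str:
    """Assemble the prompt given to the local model (assumed format)."""
    return (
        "Rewrite the query so that it follows the privacy profile.\n"
        f"Privacy profile: {profile.instruction}\n"
        f"Query: {query}\n"
        "Rewritten query:"
    )


def answer_privately(
    profile: PrivacyProfile,
    query: str,
    local_model: Callable[[str], str],
    external_model: Callable[[str], str],
) -> str:
    """Rewrite the query locally, then send only the rewrite to the external model."""
    safe_query = local_model(build_rewrite_prompt(profile, query))
    return external_model(safe_query)


def toy_local_model(prompt: str) -> str:
    # Stand-in for a local LLM: recover the query from the prompt and mask
    # one hard-coded name. A real model would follow the profile text itself.
    query = prompt.split("Query: ", 1)[1].split("\nRewritten query:", 1)[0]
    return query.replace("Alice Martin", "[NAME]")


def toy_external_model(query: str) -> str:
    # Stand-in for a commercial API call; it only ever sees the rewritten query.
    return f"(external answer to: {query})"


profile = PrivacyProfile("Hide personal names; job titles may be shared.")
result = answer_privately(
    profile,
    "Write a cover letter for Alice Martin, a nurse.",
    toy_local_model,
    toy_external_model,
)
print(result)
```

Passing both models in as plain callables keeps the sketch independent of any particular inference library, so either stand-in could be swapped for a real local model or API client.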

