Say Something Else: Rethinking Contextual Privacy as Information Sufficiency

Yunze Xiao, Wenkai Li, Xiaoyuan Wu, Ningshan Ma, Yueqi Song, Weihao Xuan

arXiv:2604.0640977.2h-index: 1

AI Analysis

This addresses privacy risks in LLM communication for users, offering a more complete strategy and evaluation framework, though it is incremental in extending existing methods.

The paper tackled the problem of privacy in LLM-generated messages by formalizing it as an Information Sufficiency task and introducing free-text pseudonymization as a new strategy, finding that pseudonymization provides the best privacy-utility tradeoff and single-message evaluation underestimates leakage by up to 16.3 percentage points.

LLM agents increasingly draft messages on behalf of users, yet users routinely overshare sensitive information and disagree on what counts as private. Existing systems support only suppression (omitting sensitive information) and generalization (replacing information with an abstraction), and are typically evaluated on single isolated messages, leaving both the strategy space and evaluation setting incomplete. We formalize privacy-preserving LLM communication as an \textbf{Information Sufficiency (IS)} task, introduce \textbf{free-text pseudonymization} as a third strategy that replaces sensitive attributes with functionally equivalent alternatives, and propose a \textbf{conversational evaluation protocol} that assesses strategies under realistic multi-turn follow-up pressure. Across 792 scenarios spanning three power-relation types (institutional, peer, intimate) and three sensitivity categories (discrimination risk, social cost, boundary), we evaluate seven frontier LLMs on privacy at two granularities, covertness, and utility. Pseudonymization yields the strongest privacy\textendash utility tradeoff overall, and single-message evaluation systematically underestimates leakage, with generalization losing up to 16.3 percentage points of privacy under follow-up.

View on arXiv PDF

Similar