CL, AI, CR | Aug 29, 2024

PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action

Harvard, Stanford
arXiv:2409.00138v3 | 141 citations | h-index: 12 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses privacy risks in LM-mediated communication for both users and developers. It offers a novel evaluation framework, though the advance is incremental because it builds on existing privacy literature and crowdsourced data.

The paper addresses the problem of evaluating privacy norm awareness in language models (LMs) used for personalized communication. It proposes PrivacyLens to assess privacy leakage in LM agents' actions and finds that state-of-the-art LMs such as GPT-4 and Llama-3-70B leak sensitive information in 25.68% and 38.69% of cases, respectively, even when given privacy-enhancing prompts.

As language models (LMs) are widely utilized in personalized communication scenarios (e.g., sending emails, writing social media posts) and endowed with a certain level of agency, ensuring they act in accordance with the contextual privacy norms becomes increasingly critical. However, quantifying the privacy norm awareness of LMs and the emerging privacy risk in LM-mediated communication is challenging due to (1) the contextual and long-tailed nature of privacy-sensitive cases, and (2) the lack of evaluation approaches that capture realistic application scenarios. To address these challenges, we propose PrivacyLens, a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories, enabling multi-level evaluation of privacy leakage in LM agents' actions. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. Using this dataset, we reveal a discrepancy between LM performance in answering probing questions and their actual behavior when executing user instructions in an agent setup. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions. We also demonstrate the dynamic nature of PrivacyLens by extending each seed into multiple trajectories to red-team LM privacy leakage risk. Dataset and code are available at https://github.com/SALT-NLP/PrivacyLens.
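The leakage percentages quoted in the abstract are rates over evaluated cases. As a rough illustration only, the sketch below computes such a rate over a handful of toy cases. The `Case` class, its field names, and the substring-based `leaks` check are hypothetical stand-ins chosen for this example; they are not the PrivacyLens data format or its leakage judgment, which are defined in the linked repository.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """One evaluated case (hypothetical structure, not the PrivacyLens schema)."""
    seed: str                   # short description of the privacy-sensitive seed
    sensitive_items: list[str]  # strings that should not appear in the output
    final_action: str           # text the agent would send (email, post, ...)


def leaks(case: Case) -> bool:
    """Naive stand-in judge: flags a leak if any sensitive string appears verbatim."""
    text = case.final_action.lower()
    return any(item.lower() in text for item in case.sensitive_items)


def leakage_rate(cases: list[Case]) -> float:
    """Fraction of cases whose final action leaks at least one sensitive item."""
    if not cases:
        return 0.0
    return sum(leaks(c) for c in cases) / len(cases)


# Toy data invented for this example.
cases = [
    Case("health detail shared with coworker", ["diagnosis"],
         "Hi team, Alex is out today because of a new diagnosis."),
    Case("salary shared with friend", ["$95,000"],
         "Sam started a new job last week and likes it so far."),
    Case("address shared publicly", ["42 Elm Street"],
         "We had a great dinner at a friend's place yesterday."),
    Case("relationship status shared with manager", ["divorce"],
         "Jordan asked for a day off next week for personal reasons."),
]

rate = leakage_rate(cases)
print(f"Leakage rate: {rate:.2%}")
```

A verbatim string match misses paraphrased leaks and can flag harmless mentions, so it only shows how a per-case judgment turns into an aggregate rate.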

Code Implementations: 1 repo
