LG | AI | CR | CY | Sep 5, 2024

Privacy Bias in Language Models: A Contextual Integrity-based Auditing Metric

arXiv:2409.03735v3 | 4 citations | h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses privacy violations in LLMs for model trainers, service providers, and policymakers, but it is incremental as it builds on existing contextual integrity frameworks.

The paper tackles privacy biases in large language models (LLMs). It defines privacy bias as the appropriateness of information flows in model responses, introduces a contextual integrity-based auditing metric to examine these biases, and finds that model capacities and optimizations affect privacy bias deltas.

As large language models (LLMs) are integrated into sociotechnical systems, it is crucial to examine the privacy biases they exhibit. We define privacy bias as the appropriateness value of information flows in responses from LLMs. A deviation between privacy biases and expected values, referred to as privacy bias delta, may indicate privacy violations. As an auditing metric, privacy bias can help (a) model trainers evaluate the ethical and societal impact of LLMs, (b) service providers select context-appropriate LLMs, and (c) policymakers assess the appropriateness of privacy biases in deployed LLMs. We formulate and answer a novel research question: how can we reliably examine privacy biases in LLMs and the factors that influence them? We present a novel approach for assessing privacy biases using a contextual integrity-based methodology to evaluate the responses from various LLMs. Our approach accounts for the sensitivity of responses across prompt variations, which hinders the evaluation of privacy biases. Finally, we investigate how privacy biases are affected by model capacities and optimizations.
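
The two quantities defined in the abstract, privacy bias and privacy bias delta, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the five-field `Flow` record (the usual contextual integrity parameters), the numeric rating scale from -2 (very inappropriate) to 2 (very appropriate), averaging over prompt variants as the aggregation rule, and the `threshold` used to flag a flow. The names `Flow`, `privacy_bias`, `privacy_bias_delta`, and `audit` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Flow:
    """One information flow, described by contextual integrity parameters (assumed schema)."""
    sender: str
    recipient: str
    subject: str
    info_type: str
    transmission_principle: str


def privacy_bias(ratings):
    """Aggregate appropriateness ratings from several prompt variants of one flow.

    Assumption: ratings are numbers on a -2..2 scale and the mean is the aggregate.
    """
    return mean(ratings)


def privacy_bias_delta(ratings, expected):
    """Deviation of the model's privacy bias from the expected appropriateness value."""
    return privacy_bias(ratings) - expected


def audit(responses, expected, threshold=1.0):
    """Return the flows whose absolute delta exceeds an (assumed) threshold."""
    flagged = {}
    for flow, ratings in responses.items():
        delta = privacy_bias_delta(ratings, expected[flow])
        if abs(delta) > threshold:
            flagged[flow] = delta
    return flagged


# Hypothetical example data: two flows, four prompt variants each.
medical = Flow("patient", "employer", "patient", "medical history", "without consent")
doctor = Flow("patient", "doctor", "patient", "medical history", "for treatment")

responses = {medical: [1, 2, 1, 2], doctor: [2, 2, 1, 1]}
expected = {medical: -1.0, doctor: 1.5}

flagged = audit(responses, expected)
```

In this made-up data the model rates the employer flow as appropriate while the expected value says otherwise, so only that flow is flagged. Averaging over prompt variants is one simple way to reduce sensitivity to prompt wording; the paper's own procedure may differ.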
