CRCLMay 29

LLM Anonymization Against Agentic Re-Identificatio

arXiv:2605.3084878.7h-index: 3
AI Analysis

This work is significant for researchers and practitioners dealing with sensitive textual data, as it provides a method to anonymize text against sophisticated agentic re-identification attacks while retaining crucial analytical utility, which is an incremental improvement over existing methods.

This paper addresses the challenge of anonymizing text from real-user interview transcripts against re-identification by web-search agents while preserving analytical utility. The authors introduce AURA, an LLM-powered mask-reconstruct framework that improves the privacy-utility frontier by adaptively scoping privacy and using a mask-reconstruct method to better preserve contextual utility.

Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text for formal privacy, or test rewritten text against non-web inference models, leaving underexplored the operating region between resistance to agentic web-search re-identification and utility retention. We introduce AURA (\textbf{A}nonymization with \textbf{U}tility-\textbf{R}etention \textbf{A}daptation), an LLM-powered \textit{mask-reconstruct} framework that decouples privacy localization from utility-preserving reconstruction and selects candidates with adversarial privacy and utility-retention checks. We evaluate AURA on real-user interview transcripts using re-identification attacks carried out by web-search agents, along with a utility evaluation based on interviewee-profile facts, codebook facts, and the joint contextual utility grid. Our results show that AURA improves the privacy-utility frontier by using adaptive privacy scope to strengthen resistance to agentic re-identification and using a mask-reconstruct anonymization method to better preserve contextual utility under fixed privacy scope.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes