CR · May 7

SnapAudit: Active Auditing of Differentially Private In-Context Learning via Snapshot-Based Simulation

arXiv:2511.13502 · 50.81 citations · h-index: 4

Predicted impact: top 36% in CR · last 90 days · Originality: Incremental advance

AI Analysis

For researchers and practitioners deploying DP-ICL, SnapAudit provides an efficient and reliable auditing tool that reveals previously undetected privacy violations in existing mechanisms.

SnapAudit is a framework for auditing differentially private in-context learning (DP-ICL) pipelines. It decomposes the pipeline into a deterministic clean-inference stage and a stochastic DP-noise stage, achieving an 80-200x speedup over prior passive auditing methods while producing tighter and more stable empirical privacy estimates. It also uncovers two concrete flaws in existing DP-ICL designs: classical Gaussian noise calibrations underestimate leakage at large privacy budgets, and the sensitivity analysis of an embedding-aggregation mechanism is incorrect when the number of partitions equals one.
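The first flaw can be illustrated with a small numerical check. The sketch below is not the paper's audit. It assumes that "classical calibration" means the textbook rule $\sigma = \Delta\sqrt{2\ln(1.25/\delta)}/\varepsilon$, whose standard proof covers only $\varepsilon < 1$, and it compares the target $\delta$ with the exact $\delta$ of a Gaussian mechanism given by the closed form used in the analytic Gaussian mechanism. The function names are illustrative.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, computed via erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def classical_sigma(eps, delta, sensitivity=1.0):
    """Textbook Gaussian calibration (assumed here; its proof requires eps < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps


def exact_gaussian_delta(eps, sigma, sensitivity=1.0):
    """Exact delta achieved at privacy level eps by Gaussian noise of scale sigma."""
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return std_normal_cdf(a - b) - math.exp(eps) * std_normal_cdf(-a - b)


if __name__ == "__main__":
    target_delta = 1e-5
    for eps in (0.5, 20.0):
        sigma = classical_sigma(eps, target_delta)
        achieved = exact_gaussian_delta(eps, sigma)
        # achieved > target_delta means the noise is too small for the claimed eps
        print(f"eps={eps}: sigma={sigma:.4f}, exact delta={achieved:.3e}")
```

Under these assumptions the small-budget setting stays within the target $\delta$, while the large-budget setting exceeds it. This agrees with the direction of the reported flaw but does not reproduce the paper's experiment.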

In-context learning (ICL) allows LLMs to adapt to new tasks via a few demonstrations, but those demonstrations may contain sensitive data. Differentially private (DP) ICL mechanisms mitigate this risk by injecting noise into the aggregation step, but verifying that an implementation actually meets its claimed privacy bound currently requires repeated end-to-end membership-inference attacks (MIAs) against the pipeline as a black box, incurring prohibitive LLM cost and yielding unstable empirical privacy estimates. We propose SnapAudit, an active auditing framework that decomposes a DP-ICL pipeline into a deterministic clean-inference stage and a stochastic DP-noise stage, and audits the full pipeline by combining a small snapshot of the former with bootstrap simulation of the latter. Because clean LLM outputs are near-deterministic at temperature zero, a few thousand clean LLM calls suffice to approximate the snapshot distribution; SnapAudit then bootstraps $10^5$ noisy trials from this snapshot at negligible additional cost, with finite-sample uncertainty controlled via an empirical Bernstein correction. For embedding-based mechanisms, we further introduce a multi-sweep search procedure that constructs maximally separable audit signals. SnapAudit achieves $80$--$200\times$ speedup over prior passive auditing while producing tighter and more stable empirical privacy estimates that closely match theoretical guarantees. Beyond efficiency, SnapAudit uncovers two concrete flaws in existing DP-ICL designs: (i) classical Gaussian noise calibrations underestimate leakage at large privacy budgets, allowing empirical leakage to exceed the theoretical bound; (ii) the sensitivity analysis of an embedding-aggregation mechanism is incorrect when the number of partitions equals one, leading to undersized noise and an outright privacy violation.
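The snapshot-plus-bootstrap idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes that each clean LLM output has already been reduced to a scalar score, that the DP stage adds Gaussian noise, and that the attack is a fixed threshold test. The names `snap_in`, `snap_out`, and `threshold` are hypothetical. The correction uses one published form of the empirical Bernstein bound (Maurer and Pontil) for values in $[0, 1]$; the abstract does not specify which form the paper uses.

```python
import math
import random


def empirical_bernstein_radius(samples, fail_prob):
    """One-sided empirical Bernstein deviation for i.i.d. samples in [0, 1]."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)  # sample variance
    log_term = math.log(2.0 / fail_prob)
    return math.sqrt(2.0 * variance * log_term / n) + 7.0 * log_term / (3.0 * (n - 1))


def simulate_hits(snapshot, sigma, threshold, trials, rng):
    """Bootstrap the noisy stage: resample a clean score, add fresh Gaussian noise,
    and record whether the threshold attack guesses 'member'."""
    return [
        1.0 if rng.choice(snapshot) + rng.gauss(0.0, sigma) > threshold else 0.0
        for _ in range(trials)
    ]


def audit_epsilon_lower_bound(snap_in, snap_out, sigma, delta, threshold,
                              trials=100_000, fail_prob=0.05, seed=0):
    """Empirical lower bound on epsilon from simulated attack outcomes.

    snap_in / snap_out: clean scores collected with / without the target record.
    Uses the (eps, delta)-DP constraint TPR <= exp(eps) * FPR + delta.
    """
    rng = random.Random(seed)
    hits_in = simulate_hits(snap_in, sigma, threshold, trials, rng)
    hits_out = simulate_hits(snap_out, sigma, threshold, trials, rng)
    tpr = sum(hits_in) / trials
    fpr = sum(hits_out) / trials
    # Split the failure probability between the two one-sided bounds.
    tpr_low = tpr - empirical_bernstein_radius(hits_in, fail_prob / 2.0)
    fpr_high = min(1.0, fpr + empirical_bernstein_radius(hits_out, fail_prob / 2.0))
    if tpr_low <= delta:
        return 0.0
    return max(0.0, math.log((tpr_low - delta) / fpr_high))


if __name__ == "__main__":
    # Toy snapshots: the target record shifts the clean score from 0 to 1.
    eps_hat = audit_epsilon_lower_bound([1.0] * 50, [0.0] * 50,
                                        sigma=1.0, delta=1e-5, threshold=0.5)
    print(f"empirical epsilon lower bound: {eps_hat:.3f}")
```

Only the snapshots require LLM calls. The loop in `simulate_hits` needs only a random number generator, which is why drawing $10^5$ noisy trials adds negligible cost in this setup.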
