CR AI LG | Dec 19, 2025

AlignDP: Hybrid Differential Privacy with Rarity-Aware Protection for LLMs

arXiv:2512.17251v1 | h-index: 4
Originality: Incremental advance
AI Analysis

The work addresses privacy risks for LLM users with a proactive defense, though it appears incremental because it builds on existing differential privacy methods.

The paper tackles the problem of protecting large language models from extraction and unauthorized fine-tuning. It designs AlignDP, a hybrid privacy lock that handles rare and non-rare fields separately under differential privacy. In the authors' toy simulation, rare categories remain hidden and frequent categories are recovered with small error.

Large language models are exposed to risks of extraction, distillation, and unauthorized fine-tuning. Existing defenses use watermarking or monitoring, but these act after leakage. We design AlignDP, a hybrid privacy lock that blocks knowledge transfer at the data interface. The key idea is to separate rare and non-rare fields. Rare fields are shielded by PAC indistinguishability, giving effective zero-epsilon local DP. Non-rare fields are privatized with RAPPOR, giving unbiased frequency estimates under local DP. A global aggregator enforces composition and budget. This two-tier design hides rare events and adds controlled noise to frequent events. We prove limits of PAC extension to global aggregation, give bounds for RAPPOR estimates, and analyze utility trade-off. A toy simulation confirms feasibility: rare categories remain hidden, frequent categories are recovered with small error.
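To make the two-tier idea concrete, here is a minimal sketch of the frequent-field path, assuming a basic one-hot RAPPOR variant with a single randomization parameter `f`. Everything named here (`rappor_report`, `estimate_counts`, `epsilon`, the `frequent` list, and the choice to encode every rare value as an all-zero vector) is a hypothetical illustration and not taken from the paper. In particular, collapsing rare values to zeros is only a stand-in for the paper's PAC-indistinguishability shield, whose actual construction is not described in the abstract.

```python
import math
import random


def rappor_report(value, frequent, f, rng):
    """One-hot encode `value` over the frequent categories, then randomize each bit.

    Assumption: any value outside `frequent` (a rare value) encodes as all zeros,
    so every rare value produces the same report distribution.
    """
    report = []
    for category in frequent:
        bit = 1 if value == category else 0
        u = rng.random()
        if u < f / 2:
            report.append(1)      # forced to 1 with probability f/2
        elif u < f:
            report.append(0)      # forced to 0 with probability f/2
        else:
            report.append(bit)    # true bit kept with probability 1 - f
    return report


def estimate_counts(reports, frequent, f):
    """Unbiased count estimate per category: (ones - (f/2) * n) / (1 - f)."""
    n = len(reports)
    estimates = {}
    for j, category in enumerate(frequent):
        ones = sum(r[j] for r in reports)
        estimates[category] = (ones - (f / 2) * n) / (1 - f)
    return estimates


def epsilon(f):
    """Local DP budget of this one-hot mechanism: 2 * ln((1 - f/2) / (f/2))."""
    return 2 * math.log((1 - f / 2) / (f / 2))


if __name__ == "__main__":
    rng = random.Random(0)
    frequent = ["a", "b", "c"]  # hypothetical public list of non-rare categories
    values = ["a"] * 10000 + ["b"] * 6000 + ["c"] * 3000 + ["x"] * 500 + ["y"] * 500
    reports = [rappor_report(v, frequent, 0.5, rng) for v in values]
    print("epsilon:", epsilon(0.5))
    print("estimated counts:", estimate_counts(reports, frequent, 0.5))
```

The estimator inverts the expected number of ones per bit position, which is why it is unbiased for the frequent categories. The rare values "x" and "y" never appear in the output at all, which mirrors the abstract's claim that rare categories remain hidden while frequent ones are recovered with small error.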

