CR AI LG | Dec 19, 2025

AlignDP: Hybrid Differential Privacy with Rarity-Aware Protection for LLMs

arXiv:2512.17251v1 | h-index: 4

Originality: Incremental advance

AI Analysis

AlignDP addresses privacy risks for LLM users with a proactive defense, though the contribution appears incremental because it builds on existing differential privacy methods.

The paper tackles the problem of protecting large language models from extraction and unauthorized fine-tuning. It designs AlignDP, a hybrid privacy lock that separates rare and non-rare fields and handles each tier with its own differential privacy mechanism. In simulations, rare categories remain hidden and frequent categories are recovered with small error.

Large language models are exposed to risks of extraction, distillation, and unauthorized fine-tuning. Existing defenses use watermarking or monitoring, but these act after leakage. We design AlignDP, a hybrid privacy lock that blocks knowledge transfer at the data interface. The key idea is to separate rare and non-rare fields. Rare fields are shielded by PAC indistinguishability, giving effective zero-epsilon local DP. Non-rare fields are privatized with RAPPOR, giving unbiased frequency estimates under local DP. A global aggregator enforces composition and budget. This two-tier design hides rare events and adds controlled noise to frequent events. We prove limits of PAC extension to global aggregation, give bounds for RAPPOR estimates, and analyze utility trade-off. A toy simulation confirms feasibility: rare categories remain hidden, frequent categories are recovered with small error.
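
The two-tier idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation. It assumes basic one-time RAPPOR on a one-hot vector with flip parameter `f`, a set of frequent categories known in advance, and rare values encoded as all zeros so that their reports do not depend on which rare value was held. The function names, the category labels, and the data are hypothetical.

```python
import math
import random


def rappor_epsilon(f):
    """Local DP epsilon of basic one-time RAPPOR on a one-hot vector."""
    return 2.0 * math.log((1.0 - f / 2.0) / (f / 2.0))


def privatize(value, frequent, f, rng):
    """Return a noisy bit vector over the frequent categories.

    Assumption (not from the paper): a rare value, i.e. one not in
    `frequent`, is encoded as all zeros before noise is added, so the
    report distribution is identical for every rare value.
    """
    report = []
    for category in frequent:
        bit = 1 if value == category else 0
        u = rng.random()
        if u < f / 2.0:
            bit = 1          # forced to 1 with probability f/2
        elif u < f:
            bit = 0          # forced to 0 with probability f/2
        report.append(bit)   # otherwise the true bit is kept
    return report


def estimate_counts(reports, frequent, f):
    """Unbiased count estimates: (ones - (f/2) * n) / (1 - f)."""
    n = len(reports)
    estimates = {}
    for j, category in enumerate(frequent):
        ones = sum(r[j] for r in reports)
        estimates[category] = (ones - (f / 2.0) * n) / (1.0 - f)
    return estimates


def simulate(seed=0, f=0.5):
    """Toy run with three frequent categories and two rare ones."""
    rng = random.Random(seed)
    frequent = ["a", "b", "c"]
    data = (["a"] * 6000 + ["b"] * 3000 + ["c"] * 990
            + ["rare_x"] * 6 + ["rare_y"] * 4)
    reports = [privatize(v, frequent, f, rng) for v in data]
    return estimate_counts(reports, frequent, f)


if __name__ == "__main__":
    print("epsilon:", rappor_epsilon(0.5))
    print("estimates:", simulate())
```

In this sketch the aggregator only ever produces estimates for the frequent categories, and the rare labels never appear in any report. The composition and budget accounting of the global aggregator described in the abstract is not modeled here.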
