CR · CL · NE · Oct 13, 2025

Secret-Protected Evolution for Differentially Private Synthetic Text Generation

arXiv:2510.10990v1 · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns around the high-quality text data used by LLMs by offering a more practical and effective method for generating synthetic text. It appears incremental, since it builds on existing private evolution frameworks.

The paper tackles the utility loss and computational overhead of differentially private synthetic text generation by proposing Secret-Protected Evolution (SecPE). Across the OpenReview, PubMed, and Yelp benchmarks, SecPE achieves lower Fréchet Inception Distance (FID) and higher downstream task accuracy than GDP-based Aug-PE baselines, while requiring less noise for the same protection level.

Text data has become extremely valuable for large language models (LLMs) and may even lead to artificial general intelligence (AGI). Much of the high-quality text in the real world is private and cannot be freely used due to privacy concerns. Differentially private (DP) synthetic text generation has therefore been proposed, aiming to produce high-utility synthetic data while protecting sensitive information. However, existing DP synthetic text generation methods impose uniform guarantees that often overprotect non-sensitive content, resulting in substantial utility loss and computational overhead. To address this, we propose Secret-Protected Evolution (SecPE), a novel framework that extends private evolution with secret-aware protection. Theoretically, we show that SecPE satisfies $(\mathrm{p}, \mathrm{r})$-secret protection, a relaxation of Gaussian DP that enables tighter utility-privacy trade-offs, while also substantially reducing computational complexity relative to baseline methods. Empirically, across the OpenReview, PubMed, and Yelp benchmarks, SecPE consistently achieves lower Fréchet Inception Distance (FID) and higher downstream task accuracy than GDP-based Aug-PE baselines, while requiring less noise to attain the same level of protection. Our results highlight that secret-aware guarantees can unlock more practical and effective privacy-preserving synthetic text generation.
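For readers unfamiliar with the private evolution family that SecPE extends, the sketch below illustrates the general kind of step where noise enters: each private sample votes for its nearest synthetic candidate, and Gaussian noise is added to the vote counts. This is not SecPE's mechanism and is not taken from the paper. The function names, the toy 2-D "embeddings", and the value of `sigma` are all illustrative assumptions.

```python
import random

def nearest_index(point, candidates):
    """Index of the candidate closest to `point` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(candidates)), key=lambda j: sq_dist(point, candidates[j]))

def noisy_vote_histogram(private_emb, synthetic_emb, sigma, rng):
    """Count nearest-neighbor votes, then add Gaussian noise to every count.

    Hypothetical helper for illustration only. Each private sample casts
    exactly one vote, so adding or removing one sample changes the exact
    histogram by at most 1 in L2 norm.
    """
    counts = [0.0] * len(synthetic_emb)
    for p in private_emb:
        counts[nearest_index(p, synthetic_emb)] += 1.0
    noisy = [c + rng.gauss(0.0, sigma) for c in counts]
    return counts, noisy

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 2-D stand-ins for text embeddings (assumed data, not from the paper).
    private_emb = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    synthetic_emb = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(10)]
    counts, noisy = noisy_vote_histogram(private_emb, synthetic_emb, sigma=2.0, rng=rng)
    print("exact votes:", counts)
    print("noisy votes:", [round(v, 2) for v in noisy])
```

With sensitivity 1, the standard Gaussian mechanism with noise scale `sigma` satisfies (1/`sigma`)-Gaussian DP, so a smaller `sigma` perturbs the histogram less but gives a weaker uniform guarantee. The abstract's claim of "less noise for the same level of protection" concerns this trade-off, under its relaxed secret-aware guarantee.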

