CL CR Aug 22, 2022

DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting

Timour Igamberdiev, Thomas Arnold, Ivan Habernal

arXiv:2208.10400v131.2589 citations h-index: 24 Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses transparency and reproducibility problems that researchers and practitioners face in privacy-preserving text rewriting. Its contribution is incremental, since it builds on existing DP methods.

The authors address the lack of reproducibility and transparency in differentially private text rewriting systems by introducing DP-Rewrite, a modular and customizable open-source framework. They demonstrate its utility by detecting a privacy leak in the pre-training approach of the ADePT system.

Text rewriting with differential privacy (DP) provides concrete theoretical guarantees for protecting the privacy of individuals in textual documents. In practice, existing systems may lack the means to validate their privacy-preserving claims, leading to problems of transparency and reproducibility. We introduce DP-Rewrite, an open-source framework for differentially private text rewriting which aims to solve these problems by being modular, extensible, and highly customizable. Our system incorporates a variety of downstream datasets, models, pre-training procedures, and evaluation metrics to provide a flexible way to lead and validate private text rewriting research. To demonstrate our software in practice, we provide a set of experiments as a case study on the ADePT DP text rewriting system, detecting a privacy leak in its pre-training approach. Our system is publicly available, and we hope that it will help the community to make DP text rewriting research more accessible and transparent.
