CR · May 22, 2018

Author Obfuscation Using Generalised Differential Privacy

Natasha Fernandes, Mark Dras, Annabelle McIver

arXiv:1805.08866v1 14.012 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the lack of author obfuscation methods that come with robust, formal privacy guarantees, a gap that matters for protecting anonymity in text-based applications. The contribution appears incremental, since it adapts an existing privacy framework to a new domain.

The paper tackles the problem of obfuscating authorship in text documents by applying generalised differential privacy, which extends privacy guarantees to arbitrary datasets endowed with a metric, and it builds on existing stylometry and NLP tools to do so.

The problem of obfuscating the authorship of a text document has received little attention in the literature to date. Current approaches are ad hoc and rely on assumptions about an adversary's auxiliary knowledge, which makes it difficult to reason about the privacy properties of these methods. Differential privacy is a well-known and robust privacy approach, but its reliance on the notion of adjacency between datasets has prevented its application to text document privacy. However, generalised differential privacy permits the application of differential privacy to arbitrary datasets endowed with a metric and has been demonstrated on problems involving the release of individual data points. In this paper we show how to apply generalised differential privacy to author obfuscation by utilising existing tools and methods from the stylometry and natural language processing literature.
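To make the idea of differential privacy over a metric concrete, the sketch below uses the simplest possible setting: points on the real line with the absolute difference as the metric. It is not the mechanism from the paper, which works on text; the function names (`laplace_mechanism`, `laplace_density`, `satisfies_metric_bound`) and the choice of metric are assumptions made purely for illustration. The property being checked is that the output densities for two inputs `x1` and `x2` differ by a factor of at most `exp(epsilon * d(x1, x2))`, so nearby inputs are harder to tell apart than distant ones.

```python
import math
import random


def laplace_density(z, x, epsilon):
    """Density at output z of Laplace noise centred on x with scale 1/epsilon."""
    return 0.5 * epsilon * math.exp(-epsilon * abs(z - x))


def laplace_mechanism(x, epsilon, rng=random):
    """Release x with added Laplace noise of scale 1/epsilon.

    The difference of two independent exponential variables with rate
    epsilon follows a Laplace distribution with scale 1/epsilon.
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return x + noise


def satisfies_metric_bound(x1, x2, z, epsilon):
    """Check the metric privacy bound at a single output point z.

    Illustrative metric (an assumption): d(x1, x2) = |x1 - x2|.
    The bound requires density(z | x1) <= exp(epsilon * d) * density(z | x2).
    """
    bound = math.exp(epsilon * abs(x1 - x2))
    lhs = laplace_density(z, x1, epsilon)
    rhs = bound * laplace_density(z, x2, epsilon)
    # Small relative tolerance to absorb floating-point rounding.
    return lhs <= rhs * (1.0 + 1e-12)
```

In this toy setting the bound follows from the triangle inequality on the metric. Applying the same idea to documents would require replacing the real line with a representation of text and a suitable distance between documents, which is where the stylometry and NLP tools mentioned in the abstract come in.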

