CL, Aug 28, 2024

StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

UW
arXiv:2408.15666v1, 28 citations, h-index: 33
Originality: Highly original
AI Analysis

The paper addresses interpretable and controllable authorship obfuscation for privacy and security applications, proposing a novel method for a known bottleneck.

The paper tackles authorship obfuscation by developing StyleRemix, which perturbs fine-grained style elements to obscure author identity, outperforming state-of-the-art baselines and larger LLMs in various domains as shown by automatic and human evaluations.

Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is an important but challenging task. Current methods using large language models (LLMs) lack interpretability and controllability, often ignoring author-specific stylistic features, resulting in less robust performance overall. To address this, we develop StyleRemix, an adaptive and interpretable obfuscation method that perturbs specific, fine-grained style elements of the original input text. StyleRemix uses pre-trained Low Rank Adaptation (LoRA) modules to rewrite an input specifically along various stylistic axes (e.g., formality and length) while maintaining low computational cost. StyleRemix outperforms state-of-the-art baselines and much larger LLMs in a variety of domains as assessed by both automatic and human evaluation. Additionally, we release AuthorMix, a large set of 30K high-quality, long-form texts from a diverse set of 14 authors and 4 domains, and DiSC, a parallel corpus of 1,500 texts spanning seven style axes in 16 unique directions.
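
The abstract mentions LoRA modules applied along separate stylistic axes. As a rough illustration only, the sketch below shows the standard LoRA idea of adding low-rank updates B·A to a frozen weight matrix, with one update per style axis and a per-axis strength. The function names, the weighted-sum combination, and the toy numbers are assumptions made for this example; the abstract does not state how StyleRemix combines its adapters.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base, adapters, weights):
    """Return base + sum_k weights[k] * (B_k @ A_k).

    base:     d_out x d_in frozen weight matrix
    adapters: dict name -> (B, A), B is d_out x r and A is r x d_in
    weights:  dict name -> float, strength per style axis (hypothetical knob)
    """
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for name, (b, a) in adapters.items():
        scale = weights.get(name, 0.0)  # axes without a weight are skipped
        delta = matmul(b, a)            # low-rank update for this axis
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


# Toy 2x2 example with two rank-1 adapters named after axes from the abstract.
base = [[1.0, 0.0], [0.0, 1.0]]
adapters = {
    "formality": ([[1.0], [0.0]], [[0.0, 2.0]]),
    "length": ([[0.0], [1.0]], [[1.0, 0.0]]),
}
merged = merge_lora(base, adapters, {"formality": 0.5, "length": 1.0})
print(merged)  # [[1.0, 1.0], [1.0, 1.0]]
```

Keeping each axis in its own adapter is what makes the strengths individually adjustable in this sketch: setting a weight to zero leaves the base matrix unchanged along that axis.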

Code Implementations: 1 repo
