CL LG Sep 18, 2021

Text Detoxification using Large Pre-trained Neural Models

David Dale, Anton Voronov, Daryna Dementieva, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

arXiv:2109.08914v231.2671 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses toxic content in text, a problem for both users and platforms, but the advance is incremental because it builds on existing style transfer and paraphrasing techniques.

The paper presents two unsupervised methods for removing toxicity from text. In a large-scale comparative study that uses reference-free metrics, both methods achieve new state-of-the-art results.

We present two novel unsupervised methods for eliminating toxicity in text. Our first method combines two recent ideas: (1) guidance of the generation process with small style-conditional language models and (2) use of paraphrasing models to perform style transfer. We use a well-performing paraphraser guided by style-trained language models to keep the text content and remove toxicity. Our second method uses BERT to replace toxic words with their non-offensive synonyms. We make the method more flexible by enabling BERT to replace mask tokens with a variable number of words. Finally, we present the first large-scale comparative study of style transfer models on the task of toxicity removal. We compare our models with a number of methods for style transfer. The models are evaluated in a reference-free way using a combination of unsupervised style transfer metrics. Both methods we suggest yield new SOTA results.
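The idea of guiding generation with small style-conditional language models can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `weight` parameter, the equal-prior assumption, and all probabilities below are hypothetical, chosen only to show how a paraphraser's next-token distribution might be reweighted toward a neutral style.

```python
import math


def normalize(log_scores):
    """Convert unnormalized log-scores into a probability distribution."""
    top = max(log_scores.values())
    exp = {tok: math.exp(v - top) for tok, v in log_scores.items()}
    total = sum(exp.values())
    return {tok: v / total for tok, v in exp.items()}


def guided_next_token(base_probs, neutral_probs, toxic_probs, weight=1.0):
    """Reweight a paraphraser's next-token distribution using two style LMs.

    Assumption: both styles have equal prior probability, so the chance
    that a token belongs to the neutral style is the neutral LM's
    probability divided by the sum of both style LMs' probabilities.
    """
    log_scores = {}
    for tok, p_base in base_probs.items():
        p_neutral = neutral_probs[tok] / (neutral_probs[tok] + toxic_probs[tok])
        # Higher `weight` pushes generation harder toward the neutral style.
        log_scores[tok] = math.log(p_base) + weight * math.log(p_neutral)
    return normalize(log_scores)


# Toy next-token distributions (hypothetical numbers, not model outputs).
base = {"idiot": 0.5, "person": 0.3, "guy": 0.2}        # paraphraser
neutral = {"idiot": 0.02, "person": 0.58, "guy": 0.40}  # neutral-style LM
toxic = {"idiot": 0.90, "person": 0.04, "guy": 0.06}    # toxic-style LM

guided = guided_next_token(base, neutral, toxic, weight=2.0)
best = max(guided, key=guided.get)
print(best, guided)
```

In this toy setup the paraphraser alone prefers the toxic token, while the guided distribution shifts almost all of its mass to the neutral alternatives. A real system would apply this reweighting at every decoding step over the full vocabulary.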
