CL · Sep 17, 2022

APPDIA: A Discourse-aware Transformer-based Style Transfer Model for Offensive Social Media Conversations

Katherine Atwell, Sabit Hassan, Malihe Alikhani

arXiv:2209.08207v131.3593 citations · h-index: 17 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more inclusive online environments by providing a new dataset and model for offensive-text style transfer, though its contribution is incremental: it applies discourse awareness to one specific domain.

The authors address offensive social media comments in two ways. They create a parallel corpus of offensive Reddit comments and their inoffensive counterparts, and they introduce discourse-aware style-transfer models that reduce offensiveness while preserving meaning. These models improve over state-of-the-art models in both automatic metrics and human evaluation.

Using style-transfer models to reduce offensiveness of social media comments can help foster a more inclusive environment. However, there are no sizable datasets that contain offensive texts and their inoffensive counterparts, and fine-tuning pretrained models with limited labeled data can lead to the loss of original meaning in the style-transferred text. To address this issue, we provide two major contributions. First, we release the first publicly available parallel corpus of offensive Reddit comments and their style-transferred counterparts, annotated by expert sociolinguists. Then, we introduce the first discourse-aware style-transfer models that can effectively reduce offensiveness in Reddit text while preserving the meaning of the original text. These models are the first to examine inferential links between the comment and the text it is replying to when transferring the style of offensive Reddit text. We propose two different methods of integrating discourse relations with pretrained transformer models and evaluate them on our dataset of offensive comments from Reddit and their inoffensive counterparts. Improvements over the baseline with respect to both automatic metrics and human evaluation indicate that our discourse-aware models are better at preserving meaning in style-transferred text when compared to the state-of-the-art discourse-agnostic models.
