Conversation Modeling to Predict Derailment
This work addresses the need for real-time prediction of conversation derailment, which would help moderators and users maintain healthy online communities. It is an incremental advance over prior methods.
The paper tackles the problem of predicting when online conversations will derail into personal attacks by proposing a hierarchical transformer framework that integrates utterance-level and conversation-level information, resulting in improved F1 scores on two datasets.
Conversations among online users sometimes derail, i.e., break down into personal attacks. Such derailment harms the healthy growth of online communities. The ability to predict whether an ongoing conversation is likely to derail could provide valuable real-time insight to interlocutors and moderators. Prior approaches predict conversation derailment retrospectively and therefore cannot forestall it. Some works attempt to make dynamic predictions as the conversation develops, but fail to incorporate multisource information, such as conversation structure and distance to derailment. We propose a hierarchical transformer-based framework that combines utterance-level and conversation-level information to capture fine-grained contextual semantics. We further propose a domain-adaptive pretraining objective to integrate conversational structure information and a multitask learning scheme to leverage the distance from each utterance to derailment. An evaluation of our framework on two conversation derailment datasets yields an improvement in F1 score for derailment prediction. These results demonstrate the effectiveness of incorporating multisource information.
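To make the multitask idea concrete, the following is a minimal sketch of how a derailment classification loss might be combined with an auxiliary distance-to-derailment loss. The abstract does not specify the loss functions, the weighting, or how distances are encoded, so everything here is an assumption for illustration: the function names `distance_labels` and `multitask_loss`, the use of binary cross-entropy plus mean squared error, the `aux_weight` parameter, and the convention that the derailing utterance immediately follows the observed context.

```python
import math


def distance_labels(num_context, derails):
    """Hypothetical labelling: number of utterances from each context
    utterance to the derailing one, assuming the derailing utterance
    comes right after the context. Non-derailing conversations get None."""
    if not derails:
        return [None] * num_context
    return [num_context - i for i in range(num_context)]


def binary_cross_entropy(p, y, eps=1e-12):
    """Standard binary cross-entropy for one predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def multitask_loss(p_derail, y_derail, pred_dist, true_dist, aux_weight=0.5):
    """Assumed combination: main derailment loss plus a weighted auxiliary
    regression loss on distance to derailment. The auxiliary term is
    skipped for utterances that have no distance label."""
    main = binary_cross_entropy(p_derail, y_derail)
    pairs = [(p, t) for p, t in zip(pred_dist, true_dist) if t is not None]
    if not pairs:
        return main
    mse = sum((p - t) ** 2 for p, t in pairs) / len(pairs)
    return main + aux_weight * mse


# Example: a derailing conversation with four context utterances.
labels = distance_labels(4, derails=True)
loss = multitask_loss(0.8, 1, [3.5, 3.0, 2.0, 1.0], labels)
```

In this sketch the auxiliary term only shapes training; at prediction time a model of this kind would use the derailment probability alone.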