CL · LG · May 26, 2022

Leveraging Dependency Grammar for Fine-Grained Offensive Language Detection using Graph Convolutional Networks

arXiv:2205.13164v1629 citations · h-index: 17
Originality: Highly original
AI Analysis

This work targets false positives in offensive language detection on social media, which can discriminate against the very groups such systems are meant to protect, by proposing a more fine-grained and accurate detection method.

The paper tackles offensive language detection on Twitter, including identifying the type and target of the offense. It proposes SyLSTM, which integrates syntactic and semantic features using a Graph Convolutional Network. The reported results show that this approach significantly outperforms the state-of-the-art BERT model while using far fewer parameters.

The last few years have witnessed an exponential rise in the propagation of offensive text on social media. Identifying this text with high precision is crucial for the well-being of society. Most existing approaches tend to give high toxicity scores to innocuous statements (e.g., "I am a gay man"). These false positives result from over-generalization on the training data, where specific terms in the statement (e.g., "gay") may have been used in a pejorative sense. Emphasis on such words alone can lead to discrimination against the classes these systems are designed to protect. In this paper, we address the problem of offensive language detection on Twitter, while also detecting the type and the target of the offence. We propose a novel approach called SyLSTM, which integrates syntactic features in the form of the dependency parse tree of a sentence and semantic features in the form of word embeddings into a deep learning architecture using a Graph Convolutional Network. Results show that the proposed approach significantly outperforms the state-of-the-art BERT model with orders of magnitude fewer parameters.
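To make the idea of combining a dependency parse tree with word embeddings more concrete, here is a minimal pure-Python sketch of one graph-convolution step over a sentence graph. It is an illustration under stated assumptions, not the authors' SyLSTM implementation. It uses the widely known propagation rule H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), which may differ from the layer used in the paper. The dependency edges are written by hand for the abstract's example sentence rather than produced by a parser, and the embedding and weight values are toy numbers chosen only for illustration.

```python
import math

def normalized_adjacency(n, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph on n nodes."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # self-loop so each token keeps its own features
    for head, dep in edges:
        a[head][dep] = 1.0  # treat each dependency arc as undirected
        a[dep][head] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def gcn_layer(a_hat, h, w):
    """One graph-convolution step: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

tokens = ["I", "am", "a", "gay", "man"]
# Hand-written (head, dependent) index pairs, assumed for illustration:
# every other token attaches to "man" (index 4).
edges = [(4, 0), (4, 1), (4, 2), (4, 3)]
# Toy 2-dimensional "word embeddings" (hypothetical values).
embeddings = [[0.1, 0.3], [0.2, -0.1], [0.0, 0.1], [0.5, 0.4], [0.3, 0.2]]
# Toy 2x3 weight matrix (hypothetical values; learned in a real model).
weights = [[1.0, -0.5, 0.2], [0.3, 0.8, -0.4]]

a_hat = normalized_adjacency(len(tokens), edges)
updated = gcn_layer(a_hat, embeddings, weights)
for tok, vec in zip(tokens, updated):
    print(tok, [round(v, 3) for v in vec])
```

After this step, each token's vector mixes its own embedding with those of its syntactic neighbours, so the representation of a word such as "gay" depends on the words it attaches to rather than on the word alone. In a full model, such syntax-aware vectors would be passed on to later layers for classification.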

