Ghmerti at SemEval-2019 Task 6: A Deep Word- and Character-based Approach to Offensive Language Identification
This work addresses the need for automated detection of offensive content in social media, but its contribution is incremental, since it builds on existing methods for a shared task.
The paper tackles offensive language identification in social media with a deep learning model that combines character-level CNNs and word-level RNNs, achieving a macro-averaged F1-score of 77.93% on subtask A.
This paper presents the models submitted by the Ghmerti team for subtasks A and B of the OffensEval shared task at SemEval 2019. OffensEval addresses the problem of identifying and categorizing offensive language in social media through three subtasks: whether or not a post is offensive (subtask A), whether an offensive post is targeted (subtask B), and whether the target is an individual, a group, or another entity (subtask C). The proposed approach combines a character-level Convolutional Neural Network, a word-level Recurrent Neural Network, and some preprocessing. The proposed model achieves a macro-averaged F1-score of 77.93% on subtask A.
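The reported metric, macro-averaged F1, is the unweighted mean of the per-class F1 scores, so the minority (offensive) class counts as much as the majority class. The following is a minimal sketch of that computation, not the official evaluation script; the label names `OFF` and `NOT`, the function names, and the toy predictions are assumptions made up for illustration.

```python
from typing import Sequence


def f1_for_class(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 for a single class, treating `label` as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are 0 (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)


# Toy data (hypothetical): "OFF" = offensive, "NOT" = not offensive.
gold = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]

score = macro_f1(gold, pred)
print(f"macro-F1 = {score:.4f}")
```

In this toy example the per-class F1 is 0.50 for `OFF` and 0.75 for `NOT`, so the macro average is 0.625, whereas plain accuracy would be 4/6.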