CL · Nov 18, 2021

Pegasus@Dravidian-CodeMix-HASOC2021: Analyzing Social Media Content for Detection of Offensive Text

Pawan Kalyan Jada, Konthala Yasaswini, Karthik Puranik, Anbukkarasi Sampath, Sathiyaraj Thangasamy, Kingston Pal Thamburaj

arXiv:2111.09836v1 · 0.51 citations

Originality Synthesis-oriented

AI Analysis

This paper addresses the challenge of moderating harmful social media content for Dravidian language communities, but its contribution is incremental: it applies existing Transformer methods to a specific dataset.

The paper tackles the detection of offensive text in informal, code-mixed Tamil and Malayalam social media content, placing in the top 8 for every task of the HASOC shared task.

To tackle the problem of detecting offensive comments and posts that are highly informal, unstructured, misspelled, and code-mixed, we introduce two methods in this paper. Offensive comments and posts on social media platforms can affect individuals, groups, and minors alike. To classify comments and posts in two popular Dravidian languages, Tamil and Malayalam, as part of the HASOC - DravidianCodeMix FIRE 2021 shared task, we employ two Transformer-based prototypes, which ranked in the top 8 for all tasks. The code for our approach is available to view and use.
