CL, Nov 18, 2021

Pegasus@Dravidian-CodeMix-HASOC2021: Analyzing Social Media Content for Detection of Offensive Text

arXiv:2111.09836v1 (1 citation)
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of moderating harmful social media content for Dravidian language communities, but its contribution is incremental: it applies existing Transformer methods to a specific dataset.

The paper tackles the detection of offensive text in informal, code-mixed social media content in Tamil and Malayalam, placing in the top 8 on all tasks of the HASOC shared task.

To address the problem of detecting offensive comments and posts that are informal, unstructured, misspelled, and code-mixed, we introduce two methods in this paper. Offensive comments and posts on social media platforms can harm individuals, groups, and minors alike. To classify comments and posts in two popular Dravidian languages, Tamil and Malayalam, as part of the HASOC - DravidianCodeMix FIRE 2021 shared task, we employ two Transformer-based models, which ranked in the top 8 on all tasks. The code for our approach is available to view and use.

