Pegasus@Dravidian-CodeMix-HASOC2021: Analyzing Social Media Content for Detection of Offensive Text
This work addresses the challenge of moderating harmful content on social media for Dravidian-language communities, but the contribution is incremental: it applies existing Transformer methods to a specific dataset.
The paper tackles the problem of detecting offensive text in informal, code-mixed social media content in Tamil and Malayalam, placing in the top 8 for all tasks of the HASOC shared task.
To address the problem of detecting offensive comments and posts that are considerably informal, unstructured, miswritten, and code-mixed, we introduce two novel methods in this paper. Offensive comments and posts on social media platforms can affect individuals, groups, and minors alike. To classify comments and posts in two popular Dravidian languages, Tamil and Malayalam, as part of the HASOC - DravidianCodeMix FIRE 2021 shared task, we employ two Transformer-based prototypes, which placed in the top 8 for all the tasks. The code for our approach is available for viewing and use.