CL · Nov 5, 2021

Developing Successful Shared Tasks on Offensive Language Identification for Dravidian Languages

Bharathi Raja Chakravarthi, Dhivya Chinnappa, Ruba Priyadharshini, Anand Kumar Madasamy, Sangeetha Sivanesan, Subalalitha Chinnaudayar Navaneethakrishnan, Sajeetha Thavareesan, Dhanalakshmi Vadivel, Rahul Ponnusamy, Prasanna Kumar Kumaresan

arXiv:2111.03375v10.212 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses content moderation for social media in local languages, but it is incremental because it builds on existing shared task methodologies.

The paper tackles offensive language identification in under-resourced Dravidian languages by organizing shared tasks at FIRE 2020 and EACL 2021, which provide a framework for comparing approaches.

With the fast growth of mobile computing and Web technologies, offensive language has become more prevalent on social networking platforms. Since identifying offensive language in local languages is essential for moderating social media content, in this paper we work with three under-resourced Dravidian languages: Malayalam, Tamil, and Kannada. We present an evaluation task held at FIRE 2020 (HASOC-DravidianCodeMix) and at DravidianLangTech at EACL 2021, designed to provide a framework for comparing different approaches to this problem. This paper describes the data creation, defines the task, lists the participating systems, and discusses the various methods.
