CL May 12, 2022

Findings of the Shared Task on Offensive Span Identification from Code-Mixed Tamil-English Comments

Manikandan Ravikiran, Bharathi Raja Chakravarthi, Anand Kumar Madasamy, Sangeetha Sivanesan, Ratnavel Rajalakshmi, Sajeetha Thavareesan, Rahul Ponnusamy, Shankar Mahadevan

arXiv:2205.06118v11.454 citations h-index: 44

Originality: Synthesis-oriented

AI Analysis

This work addresses offensive content moderation for Tamil-English code-mixed social media. It is incremental: it extends existing comment-level classification tasks with span-level annotations.

The paper tackles the identification of offensive spans in Tamil-English code-mixed social media comments by releasing an annotated dataset. Submitted systems were scored with metrics such as F1, with top submissions reaching around 0.75.

Offensive content moderation is vital on social media platforms to support healthy online discussions. However, in code-mixed Dravidian languages, it is limited to classifying whole comments without identifying the parts that contribute to offensiveness. This limitation is primarily due to the lack of annotated data for offensive spans. Accordingly, in this shared task, we provide Tamil-English code-mixed social media comments annotated with offensive spans. This paper outlines the released dataset, the methods, and the results of the submitted systems.
