CL AI LG · Nov 11, 2024

1-800-SHARED-TASKS @ NLU of Devanagari Script Languages: Detection of Language, Hate Speech, and Targets using LLMs

Jebish Purbey, Siddartha Pullakhandam, Kanwal Mehreen, Muhammad Arham, Drishti Sharma, Ashay Srivastava, Ram Mohan Rao Kadiyala

arXiv:2411.06850v11.02 citations · h-index: 4

Originality Synthesis-oriented

AI Analysis

This work addresses natural language understanding for Devanagari script languages. Its contribution is incremental: it applies existing methods to a shared task rather than proposing new ones.

The paper tackles language detection, hate speech identification, and target detection in Devanagari script languages using large language models and their ensembles, achieving F1 scores of 0.9980, 0.7652, and 0.6804 on the three tasks respectively.

This paper presents a detailed system description of our entry for the CHiPSAL 2025 shared task, focusing on language detection, hate speech identification, and target detection in Devanagari script languages. We experimented with a combination of large language models and their ensembles, including MuRIL, IndicBERT, and Gemma-2, and used techniques such as focal loss to address challenges in the natural language understanding of Devanagari languages, such as multilingual processing and class imbalance. Our approach achieved competitive results across all tasks: F1 of 0.9980, 0.7652, and 0.6804 for Sub-tasks A, B, and C respectively. This work provides insights into the effectiveness of transformer models in tasks with domain-specific and linguistic challenges, as well as areas for potential improvement in future iterations.
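The abstract names focal loss as the remedy for class imbalance but gives no formula or hyperparameters. As a rough illustration only, the sketch below implements the standard focal loss, FL(p_t) = -(1 - p_t)^gamma * log(p_t), for a single example in plain Python. The function names, the value gamma = 2.0, and the omission of a per-class weighting term are assumptions made for this sketch, not details taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example.

    gamma=2.0 is an assumed value for illustration; the abstract does not
    state the setting used by the authors.
    """
    p_t = softmax(logits)[target]  # probability assigned to the true class
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(logits, target):
    """Ordinary cross-entropy, which is focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


# An easy example (true class strongly favoured) and a hard one (true class
# strongly disfavoured), using the same hypothetical two-class logits.
easy_ce, easy_fl = cross_entropy([4.0, 0.0], 0), focal_loss([4.0, 0.0], 0)
hard_ce, hard_fl = cross_entropy([4.0, 0.0], 1), focal_loss([4.0, 0.0], 1)
print(f"easy: CE={easy_ce:.5f} focal={easy_fl:.7f}")
print(f"hard: CE={hard_ce:.5f} focal={hard_fl:.5f}")
```

Relative to cross-entropy, the easy example's loss is scaled down far more than the hard example's. This is why the technique can help when a majority class dominates the training set: the many confidently classified majority examples contribute little to the gradient, leaving more weight on the rarer, harder cases.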

