HC · CL · Oct 15, 2024

Human-LLM Collaborative Construction of a Cantonese Emotion Lexicon

Yusong Zhang, Dong Dong, Chi-tim Hung, Leonard Heyerdahl, Tamara Giles-Vernick, Eng-kiong Yeoh

arXiv:2410.11526v1 · 1 citation · h-index: 45

Originality: Incremental advance

AI Analysis

This work addresses the need for linguistic resources in low-resource languages like Cantonese, though it is incremental as it builds on existing methods for automated annotation.

The study tackled the problem of creating an emotion lexicon for Cantonese, a low-resource language, by combining LLM and human annotations, resulting in a lexicon enriched with colloquial expressions and validated on three emotion text datasets.

Large Language Models (LLMs) have demonstrated remarkable capabilities in language understanding and generation, and the use of the knowledge embedded in LLMs for automated annotation has been continually explored. This study proposes to develop an emotion lexicon for Cantonese, a low-resource language, through collaboration between an LLM and human annotators. By integrating emotion labels provided by the LLM and by human annotators, the study draws on existing linguistic resources, including lexicons in other languages and local forums, to construct a Cantonese emotion lexicon enriched with colloquial expressions. The consistency of the proposed lexicon in emotion extraction was assessed by modifying and applying three distinct emotion text datasets. The study not only validates the efficacy of the constructed lexicon but also emphasizes that collaborative annotation between humans and artificial intelligence can significantly enhance the quality of emotion labels, highlighting the potential of such partnerships for natural language processing tasks in low-resource languages.
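
As a rough illustration of the two ideas in the abstract (integrating LLM and human labels, then using the lexicon for emotion extraction), here is a minimal Python sketch. It is not the authors' procedure: the agreement rule, the function names `merge_labels` and `extract_emotions`, the label set, and the three sample entries are all assumptions made for this example.

```python
from collections import Counter


def merge_labels(llm_labels, human_labels):
    """Keep a word when the LLM and the human annotator agree (assumed rule);
    everything else is set aside for further review."""
    lexicon, needs_review = {}, []
    for word, llm_label in llm_labels.items():
        if human_labels.get(word) == llm_label:
            lexicon[word] = llm_label
        else:
            needs_review.append(word)
    return lexicon, needs_review


def extract_emotions(text, lexicon):
    """Count emotions by plain substring matching (no word segmentation)."""
    counts = Counter()
    for word, emotion in lexicon.items():
        counts[emotion] += text.count(word)
    return +counts  # unary plus drops emotions with a zero count


# Illustrative entries only; they are not taken from the paper's lexicon.
llm_labels = {"開心": "joy", "嬲": "anger", "驚": "fear"}
human_labels = {"開心": "joy", "嬲": "anger", "驚": "surprise"}

lexicon, needs_review = merge_labels(llm_labels, human_labels)
result = extract_emotions("我今日好開心", lexicon)
print(lexicon, needs_review, dict(result))
```

Substring matching keeps the sketch short; a practical pipeline would more likely segment the text first so that short entries do not match inside longer words.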
