CL, Jun 9, 2021

MICE: A Crosslinguistic Emotion Corpus in Malay, Indonesian, Chinese and English

arXiv:2106.04831v1
Originality: Synthesis-oriented
AI Analysis

This work provides a foundational dataset for studying emotion expression across languages, though it is incremental as it builds on existing emotion word research.

The researchers compiled a crosslinguistic emotion corpus (MICE) containing thousands of emotion expressions in Malay, Indonesian, Mandarin Chinese, and English, and conducted an online survey in which speakers assigned these words to basic emotion categories and rated them for valence and intensity.

MICE is a corpus of emotion words in four languages that is currently a work in progress. The study has two parts, Part I: Emotion word corpus and Part II: Emotion word survey. Part I describes how the emotion data were culled for each of the four languages and presents very preliminary data. In total, we identified 3,750 emotion expressions in Malay, 6,657 in Indonesian, 3,347 in Mandarin Chinese and 8,683 in English. We are currently evaluating and double-checking the corpus and doing further analysis of the distribution of these emotion expressions. Part II, the emotion word survey, was an online language survey that collected information on how speakers assigned the emotion words to basic emotion categories, their ratings of valence and intensity, and biographical information about all respondents.
