CL · Aug 17, 2021

A Weakly Supervised Dataset of Fine-Grained Emotions in Portuguese

Diogo Cortiz, Jefferson O. Silva, Newton Calegari, Ana Luísa Freitas, Ana Angélica Soares, Carolina Botelho, Gabriel Gaudencio Rêgo, Waldir Sampaio, Paulo Sergio Boggio

arXiv:2108.07638v2 · 1 citation

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for emotion recognition resources in low-resource languages like Portuguese, though it is incremental as it applies existing weak supervision methods to a new domain.

The researchers tackled the problem of fine-grained emotion recognition in Portuguese by creating a lexical-based weakly supervised dataset. A BERT model fine-tuned on this dataset achieved an F1-score of 0.64 on a gold-standard annotated validation set.

Affective Computing is the study of how computers can recognize, interpret, and simulate human affects. Sentiment Analysis is a common NLP task related to this topic, but it focuses only on emotion valence (positive, negative, neutral). An emerging approach in NLP is Emotion Recognition, which relies on fine-grained classification. This research describes an approach to creating a lexical-based weakly supervised corpus for fine-grained emotion in Portuguese. We evaluated our dataset by fine-tuning a transformer-based language model (BERT) and validating it on a Gold Standard annotated validation set. Our results (F1-score = 0.64) suggest that lexical-based weak supervision is an appropriate strategy for initial work in low-resource environments.
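The lexical-based weak supervision described in the abstract could look roughly like the following minimal sketch. It assumes that a text receives an emotion label whenever it contains a term from that emotion's word list. The lexicon, the emotion categories, the example sentences, and the function names `weak_label` and `build_weak_corpus` are all hypothetical, since the abstract does not specify the paper's actual lexicon or labeling rules.

```python
import re

# Hypothetical emotion lexicon: the categories and terms are illustrative
# assumptions, not the lexicon used in the paper.
LEXICON = {
    "alegria": ["feliz", "alegre", "contente"],
    "tristeza": ["triste", "deprimido", "chateado"],
    "raiva": ["furioso", "irritado", "bravo"],
}


def weak_label(text, lexicon=LEXICON):
    """Return the set of emotions whose lexicon terms occur in the text."""
    # Lowercase and split into word tokens (\w is Unicode-aware for str).
    tokens = set(re.findall(r"\w+", text.lower()))
    return {emotion for emotion, terms in lexicon.items() if tokens & set(terms)}


def build_weak_corpus(texts, lexicon=LEXICON):
    """Keep only texts matching at least one term, paired with sorted labels."""
    corpus = []
    for text in texts:
        labels = weak_label(text, lexicon)
        if labels:
            corpus.append((text, sorted(labels)))
    return corpus


# Hypothetical example sentences.
texts = [
    "Estou muito feliz hoje!",
    "Fiquei triste e irritado com a notícia.",
    "O ônibus chegou às oito.",
]
corpus = build_weak_corpus(texts)
for text, labels in corpus:
    print(labels, text)
```

Labels produced this way are noisy, because a keyword match ignores negation, irony, and context. That noise is presumably why the abstract reports evaluation against a separately annotated Gold Standard set rather than against the weak labels themselves.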
