Uncovering the Limits of Text-based Emotion Detection
This work addresses emotion detection for applications like sentiment analysis, but it is incremental as it builds on existing datasets and methods.
The paper tackles text-based emotion detection by benchmarking models on the two largest available corpora, GoEmotions and Vent. It finds that emotions expressed by writers are harder to identify than those perceived by readers, and its two BERT-based models outperform previous strong baselines on GoEmotions.
Identifying emotions from text is crucial for a variety of real-world tasks. We consider the two largest currently available corpora for emotion classification: GoEmotions, with 58k messages labelled by readers, and Vent, with 33M writer-labelled messages. We design a benchmark and evaluate several feature spaces and learning algorithms, including two simple yet novel models on top of BERT that outperform previous strong baselines on GoEmotions. Through an experiment with human participants, we also analyze the differences between how writers express emotions and how readers perceive them. Our results suggest that emotions expressed by writers are harder to identify than emotions that readers perceive. We share a public web interface for researchers to explore our models.