Manipulating emotions for ground truth emotion analysis
This paper addresses the problem of unknown validity in emotion analysis as used by computational social scientists, though the contribution is incremental: it applies existing experimental methods to text data.
The paper tackles the validation of text-based emotion analysis by using online emotion induction techniques to create ground truth data. It finds that text-based measurements capture only up to one-third of the variance in induced emotions and that pretrained classifiers perform poorly.
Text data are being used as a lens through which human cognition can be studied at a large scale. Methods like emotion analysis are now in the standard toolkit of computational social scientists but typically rely on third-person annotation with unknown validity. As an alternative, this paper introduces online emotion induction techniques from experimental behavioural research as a method for text-based emotion analysis. Text data were collected from participants who were randomly allocated to a happy, neutral or sad condition. The findings support the mood induction procedure. We then examined how well lexicon approaches can retrieve the induced emotion. All approaches resulted in statistical differences between the true emotion conditions. Overall, only up to one-third of the variance in emotion was captured by text-based measurements. Pretrained classifiers performed poorly on detecting true emotions. The paper concludes with limitations and suggestions for future research.
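To make the "variance captured" idea concrete, the following is a minimal sketch of scoring texts with a lexicon and checking how much of the variance in a ground truth emotion rating those scores explain. It is not the authors' pipeline. The toy lexicon, the example texts, the rating scale and the function names `lexicon_score` and `r_squared` are all hypothetical, and the use of a squared Pearson correlation as the variance measure is an assumption, since the abstract does not say how the figure was computed.

```python
from statistics import mean

# Hypothetical toy lexicon (word -> valence weight); not taken from the paper.
TOY_LEXICON = {
    "happy": 1.0, "joy": 1.0, "great": 0.5,
    "sad": -1.0, "cry": -1.0, "lonely": -0.5,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Mean valence of the lexicon words found in the text (0.0 if none match)."""
    tokens = text.lower().split()
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return mean(hits) if hits else 0.0


def r_squared(x, y):
    """Squared Pearson correlation: share of variance in y linearly explained by x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation in one variable, so nothing can be explained
    return sxy ** 2 / (sxx * syy)


# Made-up data: (text, self-reported valence on an arbitrary scale).
samples = [
    ("i feel happy and full of joy", 8),
    ("what a great day", 7),
    ("i went to the shop", 5),
    ("the weather was fine", 5),
    ("i feel sad and lonely", 2),
    ("i wanted to cry", 3),
]

scores = [lexicon_score(text) for text, _ in samples]
reported = [rating for _, rating in samples]
print(round(r_squared(scores, reported), 3))
```

With real data, the ratings would come from participants' own reports after the induction rather than from third-person annotation, which is the point of the ground truth design described above.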