How Will Your Tweet Be Received? Predicting the Sentiment Polarity of Tweet Replies
This work is relevant to social media users and researchers interested in sentiment analysis, but its contribution is incremental: it adapts existing methods to a new task.
The paper tackles the problem of predicting the predominant sentiment among replies to a tweet. It introduces the RETWEET dataset and a two-stage deep learning method trained on automatically labeled data, achieving promising results without any manually labeled training data.
Twitter sentiment analysis, which often focuses on predicting the polarity of tweets, has attracted increasing attention in recent years, in particular with the rise of deep learning (DL). In this paper, we propose a new task: predicting the predominant sentiment among (first-order) replies to a given tweet. To this end, we created RETWEET, a large dataset of tweets and replies manually annotated with sentiment labels. As a strong baseline, we propose a two-stage DL-based method. First, we create automatically labeled training data by applying a standard sentiment classifier to tweet replies and aggregating its predictions for each original tweet; our rationale is that individual errors made by the classifier are likely to cancel out in the aggregation step. Second, we use the automatically labeled data for supervised training of a neural network that predicts reply sentiment from the original tweets. The resulting classifier is evaluated on the new RETWEET dataset and shows promising results, especially considering that it has been trained without any manually labeled data. Both the dataset and the baseline implementation are publicly available.
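The aggregation step of the first stage can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule, so the majority vote used here, the fallback to "neutral" on ties, the label names, and the function name `aggregate_reply_sentiment` are all assumptions made for illustration and may differ from the published baseline. The input is assumed to be the per-reply output of some off-the-shelf sentiment classifier.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def aggregate_reply_sentiment(
    reply_predictions: Iterable[Tuple[str, str]],
    tie_label: str = "neutral",  # assumption: ties fall back to "neutral"
) -> Dict[str, str]:
    """Map each original tweet ID to the predominant predicted reply sentiment.

    reply_predictions: (original_tweet_id, predicted_label) pairs, one per reply,
    as produced by any per-reply sentiment classifier (not included here).
    """
    # Group the per-reply predictions by the tweet they respond to.
    labels_by_tweet: Dict[str, List[str]] = defaultdict(list)
    for tweet_id, label in reply_predictions:
        labels_by_tweet[tweet_id].append(label)

    aggregated: Dict[str, str] = {}
    for tweet_id, labels in labels_by_tweet.items():
        ranked = Counter(labels).most_common()  # sorted by count, descending
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            # No single predominant label among the replies.
            aggregated[tweet_id] = tie_label
        else:
            # Majority vote (an assumed rule, not confirmed by the abstract).
            aggregated[tweet_id] = ranked[0][0]
    return aggregated


# Hypothetical classifier output for the replies to three tweets.
predictions = [
    ("t1", "positive"), ("t1", "positive"), ("t1", "negative"),
    ("t2", "negative"),
    ("t3", "positive"), ("t3", "negative"),
]
auto_labels = aggregate_reply_sentiment(predictions)
```

The resulting tweet-level labels would then serve as targets for the second stage, in which a network is trained on the text of the original tweets alone. A single misclassified reply, such as the lone "negative" for `t1`, does not change the aggregated label, which is the error-cancellation intuition stated in the abstract.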