"i have a feeling trump will win..................": Forecasting Winners and Losers from User Predictions on Twitter
This work addresses the challenge of forecasting uncertain outcomes from social media data, with applications in event prediction and reliability assessment, though it is incremental in that it leverages existing NLP techniques.
The paper tackles the problem of predicting contest winners from explicit user predictions on Twitter. It develops an automated method that aggregates these predictions and outperforms sentiment and tweet volume baselines across a broad range of contest prediction tasks.
Social media users often make explicit predictions about upcoming events. Such statements vary in the degree of certainty the author expresses toward the outcome: "Leonardo DiCaprio will win Best Actor" vs. "Leonardo DiCaprio may win" or "No way Leonardo wins!". Can popular beliefs on social media predict who will win? To answer this question, we build a corpus of tweets annotated for veridicality, on which we train a log-linear classifier that detects positive veridicality with high precision. We then forecast uncertain outcomes using the wisdom of crowds, by aggregating users' explicit predictions. Our method for forecasting winners is fully automated, relying only on a set of contenders as input. It requires no training data of past outcomes and outperforms sentiment and tweet volume baselines on a broad range of contest prediction tasks. We further demonstrate how our approach can be used to measure the reliability of individual accounts' predictions and to retrospectively identify surprise outcomes.
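The aggregation step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `forecast_winner`, the input format (pairs of contender and classifier probability of positive veridicality), the 0.5 threshold, and the simple count-based scoring are all assumptions made for illustration.

```python
from collections import Counter


def forecast_winner(predictions, contenders, threshold=0.5):
    """Pick the contender with the most confident positive predictions.

    predictions: iterable of (contender, p_positive) pairs, where p_positive
        is a hypothetical classifier score for positive veridicality
        (the author of the tweet believes this contender will win).
    contenders: list of contender names; the only required input besides tweets.
    threshold: assumed cutoff above which a tweet counts as a positive prediction.
    """
    # Start every contender at zero so contenders with no tweets still appear.
    counts = Counter({name: 0 for name in contenders})
    for contender, p_positive in predictions:
        # Ignore tweets about names outside the contender set.
        if contender in counts and p_positive >= threshold:
            counts[contender] += 1
    # Ties are broken by the order of `contenders`.
    winner = max(contenders, key=lambda name: counts[name])
    return winner, dict(counts)


# Illustrative, made-up scores; not data from the paper.
example_predictions = [
    ("Leonardo DiCaprio", 0.9),
    ("Leonardo DiCaprio", 0.4),
    ("Matt Damon", 0.8),
    ("Leonardo DiCaprio", 0.7),
    ("Someone Else", 0.99),
]
example_contenders = ["Leonardo DiCaprio", "Matt Damon"]
winner, counts = forecast_winner(example_predictions, example_contenders)
print(winner, counts)
```

Consistent with the description above, the sketch needs no past outcomes: it takes only a contender list and scored tweets. A real system would obtain the scores from the trained veridicality classifier and might normalize or weight the counts differently.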