Forecasting of Events by Tweet Data Mining
This work addresses event forecasting for social media analysts and planners. Its contribution is incremental: it applies existing data mining theories to new event types without a major methodological advance.
The paper tackles the problem of forecasting events by mining Twitter data. It finds that certain frequent sets and association rules can serve as predictive markers: the support of some frequent sets reaches its global maximum before the expected event, though with some time delay, and the confidence of some association rules shows predictive time dynamics as well. The approach is demonstrated on two cases, the 2012 Olympic tennis final and the prediction of the Eurovision 2013 winner.
This paper describes the analysis of quantitative characteristics of frequent sets and association rules in Twitter microblog posts related to discussions of different events. For the analysis, we used the theory of frequent sets and association rules and the theory of formal concept analysis. We revealed the frequent sets and association rules that characterize the semantic relations between the concepts of the analyzed subjects. The support of some frequent sets reaches its global maximum before the expected event but with some time delay. Such frequent sets may be considered predictive markers that characterize the significance of expected events for blogosphere users. We showed that the time dynamics of the confidence of some revealed association rules can also have predictive characteristics. Exceeding a certain threshold may signal a corresponding reaction in society within the time interval between the maximum and the probable occurrence of an event. In this paper, we considered two types of events: the Olympic tennis tournament final in London, 2012, and the prediction of the Eurovision 2013 winner.
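The two quantities discussed above, the support of a frequent set and the confidence of an association rule, can be tracked per time window with a few lines of code. The following is a minimal sketch, not the authors' implementation. It assumes each tweet has already been reduced to a day index and a set of terms; the function names, the toy tweets, and the threshold value of 0.5 are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def support_by_day(tweets, itemset):
    """Return {day: support}, the share of that day's tweets containing every term in itemset."""
    itemset = frozenset(itemset)
    total = defaultdict(int)
    hits = defaultdict(int)
    for day, terms in tweets:
        total[day] += 1
        if itemset <= terms:
            hits[day] += 1
    return {day: hits[day] / total[day] for day in total}


def confidence_by_day(tweets, antecedent, consequent):
    """Return {day: confidence} of the rule antecedent -> consequent.

    Days with no tweet containing the antecedent are omitted,
    because confidence is undefined there.
    """
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    with_antecedent = defaultdict(int)
    with_both = defaultdict(int)
    for day, terms in tweets:
        if antecedent <= terms:
            with_antecedent[day] += 1
            if both <= terms:
                with_both[day] += 1
    return {day: with_both[day] / with_antecedent[day] for day in with_antecedent}


def peak_day(series):
    """Day on which the series reaches its maximum (first such day on ties)."""
    return min(d for d in series if series[d] == max(series.values()))


def first_day_above(series, threshold):
    """Earliest day whose value reaches the threshold, or None if none does."""
    for day in sorted(series):
        if series[day] >= threshold:
            return day
    return None


# Hypothetical toy data: (day index, set of terms in the tweet).
tweets = [
    (1, {"olympics", "tennis"}),
    (1, {"olympics"}),
    (2, {"tennis", "final", "murray"}),
    (2, {"tennis", "final"}),
    (2, {"olympics"}),
    (2, {"tennis", "final", "federer"}),
    (3, {"tennis", "final", "murray"}),
    (3, {"tennis"}),
]

support = support_by_day(tweets, {"tennis", "final"})
confidence = confidence_by_day(tweets, {"tennis", "final"}, {"murray"})

print("support per day:", support)
print("support peaks on day:", peak_day(support))
print("confidence per day:", confidence)
print("confidence first reaches 0.5 on day:", first_day_above(confidence, 0.5))
```

If the event in this toy example were scheduled for day 4, the support peak on day 2 would precede it by two days, which is the kind of lead interval the abstract describes. Real data would need tokenization, filtering of off-topic posts, and a frequent-set mining step to choose which sets and rules to track; none of that is shown here.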