SI CL IR LG · Nov 25, 2021

DiPD: Disruptive event Prediction Dataset from Twitter

arXiv:2111.15629v1
Originality: Synthesis-oriented
AI Analysis

This dataset addresses the need for early warning systems that monitor and mitigate disruptive events. The contribution is incremental: it focuses on data collection rather than new methods.

The authors built DiPD, a dataset of tweets for predicting disruptive events such as riots and protests. It contains 263,561 labeled records for training classification systems.

Riots and protests, if they get out of control, can cause havoc in a country. Examples include the BLM movement, climate strikes, the CAA movement, and many more, which caused large-scale disruption. Our motive behind creating this dataset was to support machine learning systems that give their users insight into trending events and alert them to events that could lead to disruption in the nation. If an event starts getting out of control, monitoring it allows it to be handled and mitigated before the matter escalates. The dataset collects tweets about past or ongoing events known to have caused disruption and labels them as 1. We also collect tweets that are considered non-eventful and label them as 0, so that both classes can be used to train a classification system. The dataset contains 94,855 records of unique events and 168,706 records of unique non-events, for a total of 263,561 records. We extract multiple features from the tweets, such as the user's follower count and the user's location, to understand the impact and reach of the tweets. This dataset might be useful in various event-related machine learning problems, such as event classification and event recognition.
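
The record counts in the abstract imply a moderately imbalanced binary classification problem. The following is a minimal sketch, not taken from the authors' code, that checks the stated total and derives per-class shares and weights from the two counts. The function name `class_summary` and the use of the "balanced" weighting heuristic (total / (number of classes × class count), the same formula scikit-learn applies for `class_weight="balanced"`) are assumptions made for illustration; the paper does not say whether or how the authors reweight classes.

```python
# Class counts as stated in the abstract.
EVENT_COUNT = 94_855       # tweets labelled 1 (disruptive events)
NON_EVENT_COUNT = 168_706  # tweets labelled 0 (non-events)


def class_summary(counts):
    """Return total size, per-class share, and 'balanced' class weights.

    `counts` maps a class label to its number of records.
    This helper is hypothetical and only illustrates the arithmetic.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    # Fraction of the dataset that belongs to each class.
    shares = {label: n / total for label, n in counts.items()}
    # 'Balanced' heuristic: weight = total / (n_classes * class_count),
    # so the rarer class receives the larger weight.
    weights = {label: total / (n_classes * n) for label, n in counts.items()}
    return total, shares, weights


counts = {1: EVENT_COUNT, 0: NON_EVENT_COUNT}
total, shares, weights = class_summary(counts)

print("total records:", total)
print("class shares:", {k: round(v, 3) for k, v in shares.items()})
print("balanced weights:", {k: round(v, 3) for k, v in weights.items()})
```

Roughly a third of the records are labelled as events, so a classifier trained on DiPD may benefit from class weighting or stratified splitting, although whether that helps in practice would need to be checked empirically.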

Code Implementations: 1 repo
