A Data Set of Internet Claims and Comparison of their Sentiments with Credibility
This work addresses the societal issue of fake news proliferation, but it is incremental as it focuses on data set creation and preliminary analysis without major methodological breakthroughs.
The paper tackles the problem of misinformation by creating a data set from fact-checking sources to study the relationship between claim sentiment and credibility, aiming to enable predictive modeling and understand the spread of fake news.
In the modern era, communication has become faster and easier, which means that false information can spread as quickly as the truth. Considering the psychological damage that fake news causes and the fact that such news proliferates faster than the truth, we need to study the phenomena that help fake news spread. An unbiased data set whose ratings are grounded in verified facts is necessary to construct predictive models for classifying news. This paper describes a methodology for creating such a data set. We collect our data from snopes.com, a fact-checking organization. Furthermore, we intend this data set to serve not only for classifying news but also for finding patterns that explain the intent behind misinformation. We also formally define an Internet Claim, its credibility, and the sentiment behind such a claim. We examine the relationship between the sentiment of a claim and its credibility. This relationship sheds light on the bigger picture behind the propagation of misinformation. We pave the way for further research that builds on the methodology described in this paper, combining data set creation and predictive modeling with research on the psychology of people, to understand why fake news spreads much faster than the truth.