The Rise and Fall of Fake News sites: A Traffic Analysis
This paper addresses the challenge of online misinformation by providing insights into the dynamics of fake news websites, though its contribution is incremental, as it builds on prior studies of fake news diffusion and detection.
The paper characterizes the operational behavior of fake news websites (their lifespan, traffic, and third-party support) in comparison to real news sites, and builds a content-agnostic ML classifier for detecting them, with unspecified accuracy.
Over the past decade, we have witnessed the rise of misinformation on the Internet, with online users constantly falling victim to fake news. Many past studies have analyzed the mechanics of fake news diffusion, as well as detection and mitigation techniques. However, there are still open questions about the operational behavior of fake news websites, such as: How old are fake news websites? Do they typically stay online for long periods of time? Do such websites synchronize their uptime and downtime with each other? Do they share similar content over time? Which third parties support their operations? How much user traffic do they attract in comparison to mainstream or real news websites? In this paper, we perform a first-of-its-kind investigation to answer such questions regarding the online presence of fake news websites and characterize their behavior in comparison to real news websites. Based on our findings, we build a content-agnostic ML classifier for automatic detection of fake news websites (i.e. accuracy) that are not yet included in manually curated blacklists.
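To make the idea of a content-agnostic classifier concrete, the following is a minimal sketch that labels a site from operational signals only, without looking at article text. It is an illustration under stated assumptions, not the paper's method: the three features (domain age, traffic volume, third-party count), the synthetic training rows, and the nearest-centroid rule are all hypothetical stand-ins, since this section does not specify the actual features or model.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical, synthetic rows: (domain_age_days, log10_monthly_visits, third_party_count).
# These values are invented for illustration and are NOT taken from the paper.
TRAIN = [
    ((4200.0, 7.1, 35.0), "real"),
    ((5100.0, 6.8, 42.0), "real"),
    ((3900.0, 7.4, 38.0), "real"),
    ((400.0, 4.2, 12.0), "fake"),
    ((650.0, 4.9, 9.0), "fake"),
    ((300.0, 3.8, 15.0), "fake"),
]


def _scale(features, scalers):
    """Standardize one feature vector with per-feature (mean, std) pairs."""
    return tuple((x - m) / s for x, (m, s) in zip(features, scalers))


def fit(rows):
    """Return (scalers, centroids): per-feature scalers and one centroid per label."""
    columns = list(zip(*(features for features, _ in rows)))
    # Guard against a zero standard deviation for constant features.
    scalers = [(mean(col), pstdev(col) or 1.0) for col in columns]
    centroids = {}
    for label in sorted({label for _, label in rows}):
        members = [_scale(f, scalers) for f, lab in rows if lab == label]
        centroids[label] = tuple(mean(col) for col in zip(*members))
    return scalers, centroids


def predict(model, features):
    """Assign the label whose centroid is closest in standardized feature space."""
    scalers, centroids = model
    scaled = _scale(features, scalers)

    def distance(centroid):
        return sqrt(sum((a - b) ** 2 for a, b in zip(scaled, centroid)))

    return min(centroids, key=lambda label: distance(centroids[label]))


model = fit(TRAIN)
print(predict(model, (500.0, 4.5, 10.0)))
print(predict(model, (4500.0, 7.0, 40.0)))
```

Because none of the inputs depend on page content, a classifier of this shape could in principle be applied to sites that are not yet on a curated blacklist; a realistic version would need real measurements and a proper evaluation.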