CL Feb 18, 2017

A Stylometric Inquiry into Hyperpartisan and Fake News

Martin Potthast, Johannes Kiesel, Kevin Reinartz, Janek Bevendorff, Benno Stein

arXiv:1702.05638v119.0682 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of identifying hyperpartisan and fake news for media analysts and fact-checkers, though it is incremental in applying existing methods to a new dataset.

The paper tackled the problem of analyzing writing styles in hyperpartisan and fake news by creating a manually fact-checked corpus of 1,627 articles, revealing that 97% of fake news came from hyperpartisan sources and showing style similarities between left-wing and right-wing news. It demonstrated that hyperpartisan news can be discriminated from mainstream with an F1 score of 0.78, but style-based fake news detection performed poorly with an F1 of 0.46.

This paper reports on a writing style analysis of hyperpartisan (i.e., extremely one-sided) news in connection to fake news. It presents a large corpus of 1,627 articles that were manually fact-checked by professional journalists from BuzzFeed. The articles originated from 9 well-known political publishers, 3 each from the mainstream, the hyperpartisan left-wing, and the hyperpartisan right-wing. In sum, the corpus contains 299 fake news articles, 97% of which originated from hyperpartisan publishers. We propose and demonstrate a new way of assessing style similarity between text categories via Unmasking (a meta-learning approach originally devised for authorship verification), revealing that the styles of left-wing and right-wing news have a lot more in common with each other than either has with the mainstream. Furthermore, we show that hyperpartisan news can be discriminated well by its style from the mainstream (F1=0.78), as can satire from both (F1=0.81). Unsurprisingly, style-based fake news detection is not up to scratch (F1=0.46). Nevertheless, the former results are important for implementing pre-screening for fake news detectors.
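The general idea behind Unmasking can be illustrated with a small sketch. As the technique is commonly described for authorship verification, a classifier is trained to separate two sets of texts, its most discriminative features are removed, and the process repeats; a fast drop in accuracy suggests that the two sets differ only in a few superficial features. The Python code below is a toy illustration under stated assumptions, not the authors' implementation: it swaps in a nearest-centroid linear classifier with leave-one-out evaluation, and the function names (`loo_accuracy`, `unmasking_curve`), the parameters, and the feature vectors are all hypothetical.

```python
def _means(rows, active):
    """Per-feature mean over the given rows, restricted to active feature indices."""
    return {j: sum(r[j] for r in rows) / len(rows) for j in active}


def loo_accuracy(a_rows, b_rows, active):
    """Leave-one-out accuracy of a nearest-centroid linear classifier.

    Assumption: this simple classifier stands in for whatever learner a real
    Unmasking setup would use. Each class needs at least two rows.
    """
    samples = [(r, True) for r in a_rows] + [(r, False) for r in b_rows]
    correct = 0
    for i, (row, is_a) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        ma = _means([r for r, lab in train if lab], active)
        mb = _means([r for r, lab in train if not lab], active)
        # Project the held-out row onto the centroid difference,
        # measured from the midpoint between the two centroids.
        score = sum((ma[j] - mb[j]) * (row[j] - (ma[j] + mb[j]) / 2) for j in active)
        correct += (score > 0) == is_a
    return correct / len(samples)


def unmasking_curve(a_rows, b_rows, rounds=3, drop_per_round=1):
    """Record accuracy per round while removing the most discriminative features."""
    active = list(range(len(a_rows[0])))
    curve = []
    for _ in range(rounds):
        if not active:
            break
        curve.append(loo_accuracy(a_rows, b_rows, active))
        ma, mb = _means(a_rows, active), _means(b_rows, active)
        # Rank features by absolute centroid difference (the classifier's weights).
        ranked = sorted(active, key=lambda j: abs(ma[j] - mb[j]), reverse=True)
        dropped = set(ranked[:drop_per_round])
        active = [j for j in active if j not in dropped]
    return curve


# Hypothetical style-feature vectors for two text categories.
category_a = [[5, 1, 1], [6, 1, 2], [5, 2, 1]]
category_b = [[1, 1, 2], [0, 2, 1], [1, 1, 1]]
print(unmasking_curve(category_a, category_b))
```

In this toy data only the first feature separates the two categories, so accuracy collapses once it is removed. Comparing how steeply such curves fall for different category pairs is what allows a statement like "left-wing and right-wing styles are more similar to each other than to the mainstream".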
