Debunking Fake News One Feature at a Time
This work addresses automated fake news detection, offering an incremental improvement based on hand-crafted features and ensemble methods.
The paper tackles fake news detection by developing a 2-stage ensemble model for stance detection between news articles and headlines, achieving a score of 78.63% on the Fake News Challenge dataset.
Identifying the stance of a news article body with respect to a given headline is the first step toward automated fake news detection. In this paper, we introduce a 2-stage ensemble model to solve the stance detection task. Using only hand-crafted features as input to a gradient boosting classifier, we achieve a score of 9161.5 out of 11651.25 (78.63%) on the official Fake News Challenge (Stage 1) dataset. We identify the most useful features for detecting fake news and discuss how sampling techniques can be used to improve recall on a highly imbalanced dataset.
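The fractional score reported above comes from a weighted metric rather than plain accuracy. The sketch below illustrates how such a score could be computed. It assumes the weighting published with the Fake News Challenge (Stage 1) scorer, which is not restated in this abstract: 0.25 points for correctly separating related from unrelated pairs, and a further 0.75 points for the exact stance of a related pair. The function names `fnc_score` and `max_score` and the toy labels are hypothetical and serve only as illustration.

```python
# Assumed label set for related headline/body pairs (FNC-1 convention).
RELATED = {"agree", "disagree", "discuss"}


def fnc_score(gold, predicted):
    """Weighted stance score under the assumed FNC-1 weighting."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    score = 0.0
    for g, p in zip(gold, predicted):
        if g in RELATED and p in RELATED:
            score += 0.25          # related pair recognised as related
            if g == p:
                score += 0.75      # exact stance also correct
        elif g == p:
            score += 0.25          # unrelated pair recognised as unrelated
    return score


def max_score(gold):
    """Best achievable score: 1.0 per related pair, 0.25 per unrelated pair."""
    return sum(1.0 if g in RELATED else 0.25 for g in gold)


# Toy example (made-up labels, not taken from the dataset).
gold = ["unrelated", "unrelated", "agree", "discuss", "disagree"]
pred = ["unrelated", "agree", "agree", "agree", "unrelated"]
relative = 100 * fnc_score(gold, pred) / max_score(gold)

# Relative score from the two figures quoted in the abstract.
reported = round(100 * 9161.5 / 11651.25, 2)
```

Under this weighting the maximum score depends on the class mix of the test set, which is why the denominator (11651.25) is not a whole number. Dividing the two figures quoted in the abstract reproduces the reported 78.63%.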