CL · Jun 4, 2020

NewB: 200,000+ Sentences for Political Bias Detection

arXiv:2006.03051v20.31 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for more nuanced political bias analysis in natural language processing, though it is incremental as it builds on existing datasets and methods.

The authors tackled the problem of detecting political bias in news by creating the NewB dataset with over 200,000 sentences from eleven sources, and they trained a recurrent neural network that achieved top-1, top-3, and top-5 accuracies of 33.3%, 61.4%, and 77.6%, significantly outperforming a baseline logistic regression model.

We present the Newspaper Bias Dataset (NewB), a text corpus of more than 200,000 sentences from eleven news sources regarding Donald Trump. While previous datasets have labeled sentences as either liberal or conservative, NewB covers the political views of eleven popular media sources, capturing more nuanced political viewpoints than a traditional binary classification system does. We train two state-of-the-art deep learning models to predict the news source of a given sentence from eleven newspapers and find that a recurrent neural network achieved top-1, top-3, and top-5 accuracies of 33.3%, 61.4%, and 77.6%, respectively, significantly outperforming a baseline logistic regression model's accuracies of 18.3%, 42.6%, and 60.8%. Using the news source label of sentences, we analyze the top n-grams with our model to gain meaningful insight into the portrayal of Trump by media sources. We hope that the public release of our dataset will encourage further research in using natural language processing to analyze more complex political biases. Our dataset is posted at https://github.com/JerryWeiAI/NewB.
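
The top-1, top-3, and top-5 figures above count a prediction as correct when the true news source is among the k highest-scoring classes. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `top_k_accuracy` and the toy scores are hypothetical and do not come from NewB.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> float:
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data with 4 hypothetical classes (illustrative only, not NewB results).
toy_scores = [
    [0.15, 0.60, 0.20, 0.05],  # true class 1 is ranked first
    [0.40, 0.30, 0.20, 0.10],  # true class 1 is ranked second
    [0.10, 0.20, 0.30, 0.40],  # true class 0 is ranked last
]
toy_labels = [1, 1, 0]

for k in (1, 2, 4):
    print(f"top-{k} accuracy: {top_k_accuracy(toy_scores, toy_labels, k):.3f}")
```

For context, if the eleven sources were equally represented (an assumption; the abstract does not state the class balance), uniform random guessing would score k/11, or roughly 9.1%, 27.3%, and 45.5% for top-1, top-3, and top-5.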
