GN | LG | ML | Jun 1, 2022

Predicting Political Ideology from Digital Footprints

arXiv:2206.00397v1 | 7 citations | h-index: 28 | Has Code
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of inferring political preferences from online behavior, a problem of interest to researchers and analysts. Its contribution is incremental, since it applies existing statistical learning methods to a new dataset.

This paper tackles the problem of predicting individual political ideology from digital footprints on Reddit, achieving up to 90.63% accuracy for economic ideology and 82.02% for social ideology using activity in non-political forums.

This paper proposes a new method to predict individual political ideology from digital footprints on one of the world's largest online discussion forums. We compiled a unique data set from the online discussion forum Reddit that contains information on the political ideology of around 91,000 users as well as records of their comment frequency and the comments' text corpus in over 190,000 different subforums of interest. Applying a set of statistical learning approaches, we show that information about activity in non-political discussion forums alone can very accurately predict a user's political ideology. Depending on the model, we are able to predict the economic dimension of ideology with an accuracy of up to 90.63% and the social dimension with an accuracy of up to 82.02%. In comparison, using the textual features from actual comments does not improve predictive accuracy. Our paper highlights the importance of revealed digital behaviour to complement stated preferences from digital communication when analysing human preferences and behaviour using online data.
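The general idea in the abstract, predicting a binary ideology label from per-subforum comment counts, can be illustrated with a minimal sketch. This is not the authors' pipeline or data: the counts are synthetic, the six subforums and their "leanings" are invented for illustration, and a plain logistic regression fitted by gradient descent stands in for the unspecified "set of statistical learning approaches". All function names are hypothetical.

```python
import math
import random


def make_synthetic_users(n_users, n_forums, rng):
    """Synthetic stand-in for a user-by-subforum comment-count matrix.

    Assumption (illustration only): even-indexed forums are more popular
    with label-1 users, odd-indexed forums with label-0 users.
    """
    users = []
    for _ in range(n_users):
        label = rng.randint(0, 1)
        counts = []
        for j in range(n_forums):
            leans_to = 1 if j % 2 == 0 else 0
            p = 0.3 if leans_to == label else 0.1
            # Comment count drawn as successes in 20 Bernoulli trials.
            counts.append(sum(1 for _ in range(20) if rng.random() < p))
        users.append((counts, label))
    return users


def to_features(counts):
    """Log-scale the raw comment counts to dampen heavy users."""
    return [math.log1p(c) for c in counts]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression with full-batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, xi):
    """Return the predicted label (0 or 1) for one feature vector."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


def run_demo(seed=0):
    """Train on 800 synthetic users and return accuracy on 400 held-out users."""
    rng = random.Random(seed)
    users = make_synthetic_users(1200, 6, rng)
    X = [to_features(counts) for counts, _ in users]
    y = [label for _, label in users]
    X_train, y_train, X_test, y_test = X[:800], y[:800], X[800:], y[800:]
    w, b = fit_logistic(X_train, y_train)
    hits = sum(predict(w, b, xi) == yi for xi, yi in zip(X_test, y_test))
    return hits / len(y_test)


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic data: {run_demo():.3f}")
```

The accuracy this prints reflects only how separable the synthetic data was made and says nothing about the figures reported in the abstract. With real data the feature matrix would be far wider and sparser (the abstract mentions over 190,000 subforums), so a sparse representation and regularisation would likely be needed.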

Code Implementations: 1 repo
