Political Leaning and Politicalness Classification of Texts
This work addresses poor out-of-distribution generalization in political text classification, a problem relevant to both researchers and practitioners. Its contribution is incremental, since it builds on existing datasets and methods.
The paper tackles the classification of text by political leaning and by politicalness. It compiles a diverse dataset by combining 12 existing datasets for leaning and extending 18 existing datasets with a politicalness label. It then uses leave-one-in and leave-one-out benchmarking to evaluate existing models and to train new ones that generalize better.
This paper addresses the challenge of automatically classifying text according to political leaning and politicalness using transformer models. We provide a comprehensive overview of existing datasets and models for these tasks, finding that current approaches create siloed solutions that perform poorly on out-of-distribution texts. To address this limitation, we compile a diverse dataset by combining 12 datasets for political leaning classification and create a new dataset for politicalness by extending 18 existing datasets with the appropriate label. Through extensive benchmarking with leave-one-in and leave-one-out methodologies, we evaluate the performance of existing models and train new ones with enhanced generalization capabilities.
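The two benchmarking schemes can be illustrated with a minimal Python sketch. It assumes that leave-one-in means training on a single dataset and testing on all the others, and that leave-one-out means training on all datasets except one and testing on the held-out one; the abstract does not define the terms, so this reading is an assumption. The function names and the dataset names are hypothetical placeholders, not taken from the paper.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[str], List[str]]  # (training datasets, test datasets)


def leave_one_in_splits(names: List[str]) -> Iterator[Split]:
    """Train on exactly one dataset, test on every other dataset."""
    for kept in names:
        test = [n for n in names if n != kept]
        yield [kept], test


def leave_one_out_splits(names: List[str]) -> Iterator[Split]:
    """Train on all datasets except one, test on the held-out dataset."""
    for held_out in names:
        train = [n for n in names if n != held_out]
        yield train, [held_out]


if __name__ == "__main__":
    # Placeholder dataset names, used only for illustration.
    datasets = ["news_a", "tweets_b", "speeches_c"]

    for train, test in leave_one_in_splits(datasets):
        print("leave-one-in:  train on", train, "-> test on", test)

    for train, test in leave_one_out_splits(datasets):
        print("leave-one-out: train on", train, "-> test on", test)
```

Under this reading, both schemes test only on datasets the model never saw during training, so they probe out-of-distribution performance. Leave-one-in shows how far a single source transfers, while leave-one-out shows how well a combined training set covers an unseen source.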