CL · May 12, 2024

Multilingual Power and Ideology Identification in the Parliament: a Reference Dataset and Simple Baselines

Çağrı Çöltekin, Matyáš Kopp, Katja Meden, Vaidas Morkevicius, Nikola Ljubešić, Tomaž Erjavec

arXiv:2405.07363

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for standardized data for analyzing parliamentary discourse in political science and NLP. Its contribution is incremental: it builds on existing corpora and uses simple methods.

The authors address the problem of identifying political orientation and power position from parliamentary speeches. They introduce a multilingual dataset derived from ParlaMint, covering 29 parliaments, and report baseline results from a simple classifier on two tasks: predicting orientation on the left-to-right axis and distinguishing governing from opposition speakers.

We introduce a dataset on political orientation and power position identification. The dataset is derived from ParlaMint, a set of comparable corpora of transcribed parliamentary speeches from 29 national and regional parliaments. We introduce the dataset, provide the reasoning behind some of the choices made during its creation, present statistics on the dataset, and, using a simple classifier, report baseline results on predicting political orientation on the left-to-right axis and on power position identification, i.e., distinguishing the speeches delivered by governing coalition party members from those of opposition party members.
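To make the power position task concrete, here is a minimal sketch of a binary speech classifier. The abstract does not say which classifier the baseline uses, so the multinomial Naive Bayes model below is only a stand-in. The class name `NaiveBayes`, the labels, and the four training sentences are invented for illustration and are not taken from ParlaMint.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace tokenization; an assumption, not the paper's preprocessing.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, hand-written examples (hypothetical; not real parliamentary speeches).
train_texts = [
    "our government has delivered this reform",
    "the coalition will continue to invest",
    "the government has failed the citizens",
    "we demand that the minister resign",
]
train_labels = ["coalition", "coalition", "opposition", "opposition"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("the minister has failed"))
print(model.predict("our coalition will invest"))
```

The left-to-right orientation task could be framed the same way by swapping in orientation labels. Any real baseline would need a proper train/test split and the per-parliament evaluation that the dataset is designed to support.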
