CL, IR, LG, ML | Feb 22, 2015

Using NLP to measure democracy

arXiv:1502.06161v1 | 1 citation
Originality: Incremental advance
AI Analysis

The paper gives political scientists and policymakers a replicable and precise tool for assessing democracy. The advance is incremental, because it applies existing NLP methods to a new domain.

The paper addresses the problem of measuring democracy by using NLP to build the first machine-coded democracy index. The resulting scores are replicable and have standard errors small enough to distinguish between cases. The index covers all independent countries from 1993 to 2012 and is based on 42 million news articles from 6,043 sources.

This paper uses natural language processing to create the first machine-coded democracy index, which I call Automated Democracy Scores (ADS). The ADS are based on 42 million news articles from 6,043 different sources and cover all independent countries in the 1993-2012 period. Unlike the democracy indices we have today, the ADS are replicable and have standard errors small enough to actually distinguish between cases. The ADS are produced with supervised learning. Three approaches are tried: a) a combination of Latent Semantic Analysis and tree-based regression methods; b) a combination of Latent Dirichlet Allocation and tree-based regression methods; and c) the Wordscores algorithm. The Wordscores algorithm outperforms the alternatives, so it is the one on which the ADS are based. There is a web application where anyone can change the training set and see how the results change: democracy-scores.org
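
To make the Wordscores step concrete, here is a minimal sketch of the basic scoring idea, not the paper's actual pipeline. Each word receives a score equal to the average of the reference scores, weighted by how strongly the word is associated with each reference text. A new text is then scored as the frequency-weighted mean of its known words. The function names, the toy token lists, and the 0 to 10 reference scores are all hypothetical. The sketch also leaves out the rescaling and uncertainty estimates that a full Wordscores implementation would normally include.

```python
from collections import Counter


def relative_freqs(tokens):
    """Return each word's share of the tokens in one text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def word_scores(reference_texts, reference_scores):
    """Score each word from reference texts with known scores.

    For word w, P(r | w) is its relative frequency in reference r divided by
    the sum of its relative frequencies across all references. The word score
    is the sum over r of P(r | w) * score_r.
    """
    freqs = [relative_freqs(tokens) for tokens in reference_texts]
    vocab = set().union(*freqs)
    scores = {}
    for w in vocab:
        total = sum(f.get(w, 0.0) for f in freqs)
        scores[w] = sum(
            f.get(w, 0.0) / total * a for f, a in zip(freqs, reference_scores)
        )
    return scores


def score_text(tokens, scores):
    """Score an unlabeled text as the frequency-weighted mean of word scores.

    Words never seen in the reference texts are ignored. Returns None when
    the text shares no words with the reference vocabulary.
    """
    known = [t for t in tokens if t in scores]
    if not known:
        return None
    freqs = relative_freqs(known)
    return sum(p * scores[w] for w, p in freqs.items())


# Toy data (invented for illustration): two reference "documents" with
# hypothetical democracy scores on a 0-10 scale.
references = [
    "election vote press free election".split(),
    "censor arrest press censor coup".split(),
]
ref_scores = [10.0, 0.0]

ws = word_scores(references, ref_scores)
new_score = score_text("vote press censor unknown".split(), ws)
```

In this toy setup, "press" appears equally often in both references and so lands midway between their scores, while words unique to one reference inherit that reference's score. Changing the reference set changes every word score, which is the kind of sensitivity the web application lets users explore.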

