CL, Jul 10, 2018

Linguistic Characteristics of Censorable Language on SinaWeibo

Kei Yin Ng, Anna Feldman, Jing Peng, Chris Leberknight

arXiv:1807.03654v1
32.01094 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses censorship detection on social media platforms, but its contribution is incremental: it applies existing methods to a specific dataset.

The paper tackles the problem of predicting censorship on SinaWeibo by analyzing linguistic characteristics, and finds that readability is the strongest indicator of censored content in its corpus.

This paper investigates censorship from a linguistic perspective. We collect a corpus of censored and uncensored posts on a number of topics and build a classifier that predicts censorship decisions independent of discussion topic. Our investigation reveals that the strongest linguistic indicator of censored content in our corpus is its readability.
