Controversy in Context
This work addresses the need for better controversiality prediction in social applications of NLP and computational argumentation, offering a larger dataset graded on a 0-10 scale and state-of-the-art prediction accuracy.
The paper tackles the problem of predicting how controversial a concept is from its immediate textual context, achieving state-of-the-art results with simple, language-independent machine-learning tools.
With the growing interest in social applications of Natural Language Processing and Computational Argumentation, a natural question is how controversial a given concept is. Prior work relied on Wikipedia's metadata and on content analysis of the articles pertaining to the concept in question. Here we show that the immediate textual context of a concept is strongly indicative of this property, and, using simple and language-independent machine-learning tools, we leverage this observation to achieve state-of-the-art results in controversiality prediction. In addition, we analyze and make available a new dataset of concepts labeled for controversiality. It is significantly larger than existing datasets, and grades concepts on a 0-10 scale, rather than treating controversiality as a binary label.
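To make the idea of context-based prediction concrete, the following is a minimal sketch, not the method used in this work. It assumes a bag-of-words Naive Bayes classifier (the abstract does not name a specific model), that each concept mention is masked with a hypothetical CONCEPT placeholder, and that labels are binary. The class name, the placeholder, and the four training sentences are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; \w is Unicode-aware, so no language-specific
    # resources (stemmers, stop-word lists) are needed.
    return re.findall(r"\w+", text.lower())


class ContextNaiveBayes:
    """Hypothetical helper: multinomial Naive Bayes over context tokens."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, contexts, labels):
        for text, label in zip(contexts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def _log_score(self, tokens, label):
        log_prior = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
        # Tokens never seen in training are skipped for both classes.
        log_likelihood = sum(
            math.log((self.word_counts[label][t] + self.alpha) / denom)
            for t in tokens
            if t in self.vocab
        )
        return log_prior + log_likelihood

    def predict_proba(self, text):
        # Probability that the context belongs to a controversial concept.
        tokens = tokenize(text)
        s0, s1 = self._log_score(tokens, 0), self._log_score(tokens, 1)
        m = max(s0, s1)  # subtract the max for numerical stability
        e0, e1 = math.exp(s0 - m), math.exp(s1 - m)
        return e1 / (e0 + e1)


# Toy, made-up training contexts (1 = controversial, 0 = not controversial).
contexts = [
    "the heated debate over CONCEPT divided lawmakers",
    "protesters clashed with critics during the debate on CONCEPT",
    "CONCEPT is a species of flowering plant",
    "the recipe calls for CONCEPT and a pinch of salt",
]
labels = [1, 1, 0, 0]

model = ContextNaiveBayes().fit(contexts, labels)
p_debate = model.predict_proba("critics joined the heated debate on CONCEPT")
p_plant = model.predict_proba("a plant species related to CONCEPT")
```

In this sketch the tokenizer and the classifier depend only on word counts, which is one way to read "simple and language-independent". A graded score on the 0-10 scale would need a different target or a regression model, which this example does not attempt.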