Learning From Revisions: Quality Assessment of Claims in Argumentation at Scale
This work addresses the challenge of topic-independent quality assessment for claims in argumentation. Its contribution is incremental: it builds on existing methods and adds new data and tasks.
The paper tackles the problem of assessing claim quality in computational argumentation by comparing different revisions of the same claim. It compiles a large-scale corpus with over 377k claim revision pairs and reports first results suggesting that learned indicators generalize well across topics.
Assessing the quality of arguments and of the claims the arguments are composed of has become a key task in computational argumentation. However, even if different claims share the same stance on the same topic, their assessment depends on the prior perception and weighting of the different aspects of the topic being discussed. This renders it difficult to learn topic-independent quality indicators. In this paper, we study claim quality assessment irrespective of discussed aspects by comparing different revisions of the same claim. We compile a large-scale corpus with over 377k claim revision pairs of various types from kialo.com, covering diverse topics from politics, ethics, entertainment, and others. We then propose two tasks: (a) assessing which claim of a revision pair is better, and (b) ranking all versions of a claim by quality. Our first experiments with embedding-based logistic regression and transformer-based neural networks show promising results, suggesting that learned indicators generalize well across topics. In a detailed error analysis, we give insights into what quality dimensions of claims can be assessed reliably. We provide the data and scripts needed to reproduce all results.
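The two tasks can be related with a small sketch: a pairwise scorer (task a) can be aggregated into a ranking of all versions of a claim (task b). This is only an illustration under stated assumptions, not the method used in the paper. The names `revision_pairs`, `rank_versions`, and `toy_scorer` are hypothetical, the example claims are invented, and the length-based scorer merely stands in for a trained model such as the embedding-based logistic regression or transformer classifiers mentioned above.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def revision_pairs(history: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each version of a claim with its direct successor (older, newer)."""
    return list(zip(history, history[1:]))


def rank_versions(
    versions: Sequence[str],
    prob_second_better: Callable[[str, str], float],
) -> List[str]:
    """Rank versions best-first by summing pairwise preference scores.

    prob_second_better(a, b) is assumed to return a value in [0, 1]:
    the estimated probability that claim b is better than claim a.
    """
    scores = {i: 0.0 for i in range(len(versions))}
    for i, j in combinations(range(len(versions)), 2):
        p = prob_second_better(versions[i], versions[j])
        scores[j] += p          # credit for the second claim
        scores[i] += 1.0 - p    # remaining credit for the first claim
    order = sorted(scores, key=lambda i: scores[i], reverse=True)
    return [versions[i] for i in order]


def toy_scorer(a: str, b: str) -> float:
    """Hypothetical placeholder for a trained model: prefers the longer claim."""
    if len(a) == len(b):
        return 0.5
    return 1.0 if len(b) > len(a) else 0.0


# Invented revision history of a single claim, oldest first.
history = [
    "Dogs are good",
    "Dogs are good pets.",
    "Dogs are good pets because they are loyal.",
]

pairs = revision_pairs(history)             # inputs for task (a)
ranking = rank_versions(history, toy_scorer)  # output for task (b)
```

Summing pairwise scores is one simple aggregation choice; it needs a quadratic number of comparisons per claim, which is usually acceptable because a claim rarely has more than a handful of versions.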