A Corpus of Sentence-level Revisions in Academic Writing: A Step towards Understanding Statement Strength in Communication
This work addresses a lack of data for studying statement strength, which matters in fields such as media and academia. The contribution is incremental, however, because it focuses on data collection rather than novel methods.
The paper tackles the problem of understanding statement strength in communication by introducing a corpus of sentence-level revisions from academic writing, providing data to distinguish between strong and weak statements.
The strength with which a statement is made can have a significant impact on the audience. For example, international relations can be strained by how the media in one country describes an event in another; and papers can be rejected because they overstate or understate their findings. It is thus important to understand the effects of statement strength. A first step is to be able to distinguish between strong and weak statements. However, even this problem is understudied, partly due to a lack of data. Since strength is inherently relative, revisions of texts that make claims are a natural source of data on strength differences. In this paper, we introduce a corpus of sentence-level revisions from academic writing. We also describe insights gained from our annotation efforts for this task.
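To make the idea of sentence-level revisions concrete, the following is a minimal sketch of how changed sentences might be paired across two versions of a text. It is an illustration under stated assumptions, not the procedure used to build the corpus: the function names `split_sentences` and `revision_pairs`, the naive punctuation-based sentence splitter, and the `min_ratio` similarity threshold are all hypothetical choices made for this example.

```python
import difflib
import re


def split_sentences(text):
    """Naive sentence splitter (assumption): break after ., ? or ! plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]


def revision_pairs(old_text, new_text, min_ratio=0.5):
    """Pair each changed sentence in old_text with its most similar sentence in new_text.

    min_ratio is a hypothetical threshold on difflib's similarity ratio;
    pairs below it are treated as unrelated rather than as revisions.
    """
    old_sents = split_sentences(old_text)
    new_sents = split_sentences(new_text)
    pairs = []
    for old in old_sents:
        if old in new_sents:
            continue  # sentence is unchanged, so it carries no strength difference
        best, best_ratio = None, 0.0
        for new in new_sents:
            if new in old_sents:
                continue  # skip sentences that already exist verbatim in the old version
            ratio = difflib.SequenceMatcher(None, old, new).ratio()
            if ratio > best_ratio:
                best, best_ratio = new, ratio
        if best is not None and best_ratio >= min_ratio:
            pairs.append((old, best))
    return pairs


if __name__ == "__main__":
    v1 = "We show that the method works. Results are given in Section 3."
    v2 = "We suggest that the method may work. Results are given in Section 3."
    for old, new in revision_pairs(v1, v2):
        print("OLD:", old)
        print("NEW:", new)
```

Each resulting pair holds two versions of the same claim, which is the kind of unit that can then be annotated for which version is stronger. A real pipeline would likely need a more robust sentence splitter and alignment step than this sketch assumes.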