Examining the rhetorical capacities of neural language models
This work addresses a gap in the analysis of inter-sentential rhetorical knowledge in language models. The contribution is incremental: it extends existing syntactic analyses to discourse-level features.
The paper evaluates the rhetorical capacities of neural language models with a method based on Rhetorical Structure Theory (RST), and finds that BERT-based models encode more rhetorical knowledge than other Transformer models such as GPT-2 and XLNet.
Recently, neural language models (LMs) have demonstrated impressive abilities in generating high-quality discourse. While many recent papers have analyzed the syntactic aspects encoded in LMs, there has been no analysis to date of their inter-sentential, rhetorical knowledge. In this paper, we propose a method that quantitatively evaluates the rhetorical capacities of neural LMs. We examine how well neural LMs understand the rhetoric of discourse by evaluating their ability to encode a set of linguistic features derived from Rhetorical Structure Theory (RST). Our experiments show that BERT-based LMs outperform other Transformer LMs, revealing richer discourse knowledge in their intermediate layer representations. In addition, GPT-2 and XLNet apparently encode less rhetorical knowledge, and we suggest an explanation drawing on linguistic philosophy. Our method offers an avenue towards quantifying the rhetorical capacities of neural LMs.
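The abstract does not spell out how "ability to encode" a feature is measured. One common way to make the idea concrete is a linear probe: fit a simple regressor from a layer's representation to a feature value and compare held-out error across layers. The sketch below illustrates that general technique only; it is not the paper's implementation. Everything in it is an assumption made for illustration: the data are synthetic, `relation_counts` stands in for a hypothetical RST-derived feature (the count of one relation type per document), and the two "layers" are random vectors, one of which is constructed to carry the feature.

```python
import random


def fit_linear_probe(X, y, lr=0.05, epochs=500):
    """Fit y ~ w.x + b by full-batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += 2.0 * err * xi[j] / n
            grad_b += 2.0 * err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mean_squared_error(w, b, X, y):
    """Mean squared error of the probe's predictions on (X, y)."""
    errs = [
        (sum(wj * xj for wj, xj in zip(w, xi)) + b - yi) ** 2
        for xi, yi in zip(X, y)
    ]
    return sum(errs) / len(errs)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_docs, dim, n_train = 200, 4, 150

# Hypothetical RST-derived target: count of one relation type per document.
relation_counts = [float(rng.randint(0, 5)) for _ in range(n_docs)]

# Synthetic stand-ins for per-document layer representations (assumption):
# the first layer carries the feature in its first dimension, the second
# layer is pure noise.
layers = {
    "informative_layer": [
        [0.5 * c + 0.1 * rng.gauss(0, 1)]
        + [rng.gauss(0, 1) for _ in range(dim - 1)]
        for c in relation_counts
    ],
    "uninformative_layer": [
        [rng.gauss(0, 1) for _ in range(dim)] for _ in relation_counts
    ],
}

# Train one probe per layer and score it on held-out documents.
mse_by_layer = {}
for name, reps in layers.items():
    w, b = fit_linear_probe(reps[:n_train], relation_counts[:n_train])
    mse_by_layer[name] = mean_squared_error(
        w, b, reps[n_train:], relation_counts[n_train:]
    )

for name, mse in mse_by_layer.items():
    print(f"{name}: held-out MSE = {mse:.3f}")
```

Under this reading, a lower held-out error for a layer suggests that the feature is more linearly recoverable from it. In a real experiment the synthetic vectors would be replaced by representations extracted from each LM layer, and the target by features computed from RST parses.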