Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models
This work addresses the need for a better understanding of discourse structures in language models. The contribution is incremental, as it builds on prior BERTology research.
The paper analyzes discourse information in pre-trained and fine-tuned language models. It proposes a novel method to infer discourse structures from long documents and assesses both their accuracy and their similarity to baselines, with results showing improved performance over existing approaches.
With a growing body of BERTology work analyzing different components of pre-trained language models, we extend this line of research through an in-depth analysis of discourse information in pre-trained and fine-tuned language models. We move beyond prior work along three dimensions. First, we describe a novel approach to infer discourse structures from arbitrarily long documents. Second, we propose a new type of analysis to explore where and how accurately intrinsic discourse is captured in the BERT and BART models. Finally, we assess how similar the generated structures are to a variety of baselines, and how they are distributed within and between models.