CL, Apr 26, 2020

Assessing Discourse Relations in Language Generation from GPT-2

arXiv:2004.12506v3993 citations
AI Analysis

This work addresses discourse coherence in language generation for NLP applications, but it is incremental as it builds on existing models and focuses on a specific linguistic aspect.

The study assessed GPT-2's ability to generate text with valid discourse relations, finding it often fails but improves with fine-tuning, and proposed a decoupled strategy to address these issues.

Recent advances in NLP have been attributed to the emergence of large-scale pre-trained language models. GPT-2, in particular, is suited for generation tasks given its left-to-right language modeling objective, yet the linguistic quality of its generated text has largely remained unexplored. Our work takes a step toward understanding GPT-2's outputs in terms of discourse coherence. We perform a comprehensive study on the validity of explicit discourse relations in GPT-2's outputs under both organic generation and fine-tuned scenarios. Results show GPT-2 does not always generate text containing valid discourse relations; nevertheless, its text is more aligned with human expectation in the fine-tuned scenario. We propose a decoupled strategy to mitigate these problems and highlight the importance of explicitly modeling discourse information.

