Sequentially Controlled Text Generation
This work addresses the lack of structure in long generated documents, a problem for applications that require coherent writing. The contribution is incremental: it builds on existing generation methods and adds a new control task.
The paper tackles the problem of imposing structure on long-range text generation to prevent rambling. It proposes a sequentially controlled text generation pipeline and shows that increased structural awareness generally improves control-accuracy, grammaticality, coherency, and topicality, approaching human-level performance.
While GPT-2 generates sentences that are remarkably human-like, longer documents can ramble and do not follow human-like writing structure. We study the problem of imposing structure on long-range text. We propose a novel controlled text generation task, sequentially controlled text generation, and identify a dataset, NewsDiscourse, as a starting point for this task. We develop a sequentially controlled text generation pipeline with generation and editing. We test different degrees of structural awareness and show that, in general, more structural awareness results in higher control-accuracy, grammaticality, coherency and topicality, approaching human-level writing performance.
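To make the task concrete, the following is a minimal sketch of what a sequentially controlled generation loop might look like. It is not the authors' implementation. The awareness levels (`local`, `past`, `full`), the function names, the toy generator, and the label strings are all assumptions introduced for illustration. A real system would replace `toy_generator` with a language model such as GPT-2, and the editing stage mentioned in the abstract is not covered here.

```python
from typing import Callable, List, Sequence

# Hypothetical signature for a sentence generator:
# (sentences so far, target control code, visible control codes) -> new sentence
Generator = Callable[[List[str], str, Sequence[str]], str]


def visible_codes(codes: Sequence[str], i: int, awareness: str) -> List[str]:
    """Return the control codes the generator may see at step i.

    The three levels are illustrative assumptions about what
    "degrees of structural awareness" could mean:
      local: only the code for the current sentence
      past:  codes for the current and all earlier sentences
      full:  the whole code sequence for the document
    """
    if awareness == "local":
        return [codes[i]]
    if awareness == "past":
        return list(codes[: i + 1])
    if awareness == "full":
        return list(codes)
    raise ValueError(f"unknown awareness level: {awareness!r}")


def generate_document(
    codes: Sequence[str],
    generate_sentence: Generator,
    awareness: str = "full",
) -> List[str]:
    """Generate one sentence per control code, in order."""
    sentences: List[str] = []
    for i, target in enumerate(codes):
        context = visible_codes(codes, i, awareness)
        # Each step conditions on the text generated so far plus the codes.
        sentences.append(generate_sentence(sentences, target, context))
    return sentences


def toy_generator(previous: List[str], target: str, context: Sequence[str]) -> str:
    """Stand-in for a language model; it only echoes its inputs."""
    return f"[{target}] sentence {len(previous) + 1} (saw {len(context)} code(s))"


if __name__ == "__main__":
    # Placeholder labels, not taken from the dataset's label set.
    plan = ["Main Event", "Background", "Consequence"]
    for line in generate_document(plan, toy_generator, awareness="past"):
        print(line)
```

Passing the generator as a callable keeps the control logic separate from the model, so the same loop can be rerun at each awareness level to compare outputs.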