Facts2Story: Controlling Text Generation by Key Facts
This work addresses content control in text generation and is aimed at researchers and developers building controllable text generation systems.
This paper frames controlled text generation as the task of expanding a sequence of natural language facts into a longer narrative. The authors find that auto-regressive models such as GPT2 produce more fluent text but struggle to adhere to the given facts. Their proposed plan-and-cloze model, based on fine-tuned XLNet, achieves competitive fluency while adhering to the requested content.
Recent advancements in self-attention neural network architectures have raised the bar for open-ended text generation. Yet, while current methods are capable of producing coherent text several hundred words long, attaining control over the generated content and evaluating it remain open questions. We propose a controlled generation task based on expanding a sequence of facts, expressed in natural language, into a longer narrative. We introduce human-based evaluation metrics for this task, as well as a method for deriving a large training dataset. We evaluate three methods on this task, all based on fine-tuning pre-trained models. We show that while auto-regressive, unidirectional language models such as GPT2 produce better fluency, they struggle to adhere to the requested facts. We propose a plan-and-cloze model (using fine-tuned XLNet) which produces competitive fluency while adhering to the requested content.
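To make the cloze idea concrete, the following is a minimal sketch of how a sequence of facts might be laid out as a fill-in-the-blanks template. It is an illustration under stated assumptions, not the authors' implementation: the `<mask>` placeholder, the fixed `gap_len`, and the function name `build_cloze_template` are all hypothetical, and the abstract does not specify how the planning step chooses where the facts go.

```python
from typing import List

# Hypothetical placeholder; a real pre-trained model defines its own mask token.
MASK = "<mask>"


def build_cloze_template(facts: List[str], gap_len: int = 5) -> List[str]:
    """Interleave the words of each fact with runs of mask tokens.

    Assumption for illustration: every gap has the same fixed length.
    A planning step could instead decide each gap's length.
    """
    tokens: List[str] = []
    for fact in facts:
        tokens.extend([MASK] * gap_len)  # open slots before the fact
        tokens.extend(fact.split())      # the fact's words stay fixed
    tokens.extend([MASK] * gap_len)      # open slots after the last fact
    return tokens


if __name__ == "__main__":
    facts = ["a detective arrives in town", "the mayor is found dead"]
    print(" ".join(build_cloze_template(facts, gap_len=3)))
```

In a layout like this the fact words are fixed in the sequence and only the masked slots are generated, which is one plausible reason a cloze-style model would adhere to the requested content more reliably than a left-to-right model that is merely conditioned on the facts.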