A Semi-Supervised Approach for Low-Resourced Text Generation
The paper addresses the low-resource problem in text generation, a common problem, but the approach is incremental.
The paper tackles the problem of limited labeled data in text generation by using unlabeled data with denoising auto-encoders and language model-based reinforcement learning, resulting in significant improvements over basic models.
Recently, encoder-decoder neural models have achieved great success on text generation tasks. However, the performance of such models is usually limited by the amount of well-labeled data, which is very expensive to obtain. The low-resource problem (a shortage of labeled data) is quite common across text generation tasks, whereas unlabeled data are usually abundant. In this paper, we propose a method that uses unlabeled data to improve the performance of such models in low-resource settings. We use a denoising auto-encoder (DAE) and language model (LM) based reinforcement learning (RL) to enhance the training of the encoder and decoder with unlabeled data. Our method adapts to different text generation tasks and makes significant improvements over basic text generation models.
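The two unlabeled-data components named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the noise function or the language model, so everything below is an assumption made for illustration: `add_noise` (random token deletion plus a local shuffle) stands in for the DAE corruption step, and `BigramLM.reward` (an add-one-smoothed bigram model returning the average log-probability) stands in for the LM-based reward that an RL objective could use. All names and parameters are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def add_noise(tokens, drop_prob=0.1, max_shuffle_dist=3, seed=None):
    """Corrupt a token list (assumed noise: deletion + local shuffle).

    A denoising auto-encoder would be trained to reconstruct `tokens`
    from the corrupted list returned here.
    """
    if not tokens:
        return []
    rng = random.Random(seed)
    # Drop each token independently, but never return an empty input.
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept:
        kept = [rng.choice(tokens)]
    # Local shuffle: sort by original position plus a bounded random offset.
    keys = [i + rng.uniform(0, max_shuffle_dist) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return [kept[i] for i in order]


class BigramLM:
    """Toy add-one-smoothed bigram LM fitted on unlabeled sentences."""

    def __init__(self, corpus):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = set()
        for sentence in corpus:
            toks = ["<s>"] + list(sentence) + ["</s>"]
            self.vocab.update(toks)
            for prev, cur in zip(toks, toks[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def reward(self, tokens):
        """Average log-probability per transition; higher means more fluent."""
        toks = ["<s>"] + list(tokens) + ["</s>"]
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        total = 0.0
        for prev, cur in zip(toks, toks[1:]):
            num = self.bigram_counts[(prev, cur)] + 1
            den = self.context_counts[prev] + v
            total += math.log(num / den)
        return total / (len(toks) - 1)


if __name__ == "__main__":
    unlabeled = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    print(add_noise(unlabeled[0], seed=0))
    lm = BigramLM(unlabeled)
    print(lm.reward(["the", "cat", "sat"]), lm.reward(["sat", "cat", "the"]))
```

In an actual system, the corrupted/original pairs would supply a reconstruction loss for the encoder-decoder, and the reward would score sampled decoder outputs inside a policy-gradient update; both training loops are omitted here.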