Generating Sentences from Disentangled Syntactic and Semantic Spaces
This work improves syntactic control in natural language generation, a practical bottleneck for researchers and practitioners. The contribution is incremental in that it builds on existing VAE frameworks.
Variational auto-encoders generate sentences without explicitly modeling syntax. The paper addresses this by disentangling the latent space into syntactic and semantic spaces, which improves generation performance and enables novel applications such as unsupervised paraphrase generation.
Variational auto-encoders (VAEs) are widely used in natural language generation due to the regularization of the latent space. However, generating sentences from the continuous latent space does not explicitly model syntactic information. In this paper, we propose to generate sentences from disentangled syntactic and semantic spaces. Our proposed method explicitly models syntactic information in the VAE's latent space by using the linearized tree sequence, leading to better language generation performance. Additionally, sampling in the disentangled syntactic and semantic latent spaces enables novel applications, such as unsupervised paraphrase generation and syntax-transfer generation. Experimental results show that our proposed model achieves similar or better performance in various tasks, compared with state-of-the-art related work.
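To make the phrase "linearized tree sequence" concrete, the following is a minimal sketch of one common way to flatten a constituency parse into a token sequence that a sequence model could predict. The nested-tuple tree representation, the function name `linearize`, and the exact bracket format are assumptions made for illustration; the abstract does not specify the format the authors use.

```python
from typing import List

# Assumed toy representation (not from the paper):
# an internal node is (label, child, child, ...),
# a pre-terminal is (pos_tag, word).

def linearize(tree) -> List[str]:
    """Depth-first traversal that emits labeled brackets and POS tags, dropping words."""
    label, *children = tree
    # Pre-terminal node: keep only the POS tag so the sequence carries syntax, not content.
    if len(children) == 1 and isinstance(children[0], str):
        return [label]
    tokens = ["(" + label]          # opening bracket for this constituent
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")" + label)      # matching closing bracket
    return tokens

# Hypothetical parse of "the cat sleeps".
example = ("S",
           ("NP", ("DT", "the"), ("NN", "cat")),
           ("VP", ("VBZ", "sleeps")))

sequence = linearize(example)
print(" ".join(sequence))  # (S (NP DT NN )NP (VP VBZ )VP )S
```

Because the words are removed, two sentences with different content but the same structure map to the same sequence, which is the property that makes such a sequence a plausible training target for a syntax-only latent variable.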