Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
This work addresses the problem of generating task-specific natural language from minimal labeled data, a known bottleneck for NLP practitioners, and proposes a novel method rather than an incremental improvement.
The paper tackles compositional generalization in low-resource natural language generation by proposing VAE-DPRIOR, a variational autoencoder with novel ε-disentangled priors that separate content and label representations. The model outperforms competitive baselines on data augmentation for continuous zero/few-shot learning and on few-shot text style transfer.
In this paper, we propose a variational autoencoder with disentanglement priors, VAE-DPRIOR, for task-specific natural language generation with no or only a handful of task-specific labeled examples. To tackle compositional generalization across tasks, our model performs disentangled representation learning by introducing one conditional prior for the latent content space and another conditional prior for the latent label space. Both types of priors satisfy a novel property called $ε$-disentangled. We show both empirically and theoretically that the novel priors can disentangle representations even without the specific regularizations used in prior work. The content prior enables sampling diverse content representations directly from the content space learned from the seen tasks and fusing them with the representations of novel tasks to generate semantically diverse texts in low-resource settings. Our extensive experiments demonstrate the superior performance of our model over competitive baselines in terms of i) data augmentation in continuous zero/few-shot learning, and ii) text style transfer in the few-shot setting.
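The sampling-and-fusing step described above can be illustrated with a toy sketch. It is not the paper's implementation: the diagonal Gaussian form of the priors, fusion by concatenation, the parameter tables `CONTENT_PRIORS` and `LABEL_PRIORS`, and the function names are all assumptions made for illustration, since the abstract does not specify them.

```python
import random

# Hypothetical prior parameters (mean, std) per condition; in a trained model
# these would be learned, here they are fixed toy values.
CONTENT_PRIORS = {
    "cluster_0": ([0.0, 1.0], [0.5, 0.5]),
    "cluster_1": ([2.0, -1.0], [0.3, 0.3]),
}
LABEL_PRIORS = {
    "novel_task": ([1.0, 1.0, 1.0], [0.1, 0.1, 0.1]),
}


def sample_diag_gaussian(mean, std, rng):
    """Draw one vector from a diagonal Gaussian (assumed prior family)."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def fuse(content_z, label_z):
    """Fuse the two latent vectors by concatenation (an assumption)."""
    return content_z + label_z


def sample_decoder_inputs(label, n, seed=0):
    """Pair n content samples from seen-task priors with a novel label's latent."""
    rng = random.Random(seed)
    label_mean, label_std = LABEL_PRIORS[label]
    samples = []
    for _ in range(n):
        # Pick a content condition learned from seen tasks, then sample from it.
        cluster = rng.choice(sorted(CONTENT_PRIORS))
        c_mean, c_std = CONTENT_PRIORS[cluster]
        content_z = sample_diag_gaussian(c_mean, c_std, rng)
        label_z = sample_diag_gaussian(label_mean, label_std, rng)
        samples.append(fuse(content_z, label_z))
    return samples
```

In this sketch, each returned vector would be passed to a decoder to produce one augmented example for the novel task; varying the sampled content vector while keeping the label condition fixed is what yields diverse texts under the same label.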