Bilingual-GAN: A Step Towards Parallel Text Generation
This work addresses parallel text generation for machine translation, but its contribution is incremental: it builds on existing GAN and attention-based methods.
The paper tackles two problems, generating parallel sentences in two languages concurrently and translating bidirectionally, by proposing an adversarial latent space model that combines GANs with sequence-to-sequence approaches. The model achieves competitive performance on English-French translation using the Europarl and Multi30k datasets, in both supervised and unsupervised settings.
Latent-space-based GAN methods and attention-based sequence-to-sequence models have achieved impressive results in text generation and unsupervised machine translation, respectively. Leveraging the two domains, we propose an adversarial latent-space-based model capable of generating parallel sentences in two languages concurrently and translating bidirectionally. The bilingual generation goal is achieved by sampling from the latent space that is shared between both languages. First, two denoising autoencoders are trained, with shared encoders and back-translation to enforce a shared latent state between the two languages. The decoder is shared for the two translation directions. Next, a GAN is trained to generate synthetic "code" mimicking the languages' shared latent space. This code is then fed into the decoder to generate text in either language. We perform our experiments on the Europarl and Multi30k datasets, on the English-French language pair, and document our performance using both supervised and unsupervised machine translation.
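The generation-time data flow described above (noise, then a synthetic code, then one shared decoder asked for either language) can be illustrated with a toy sketch. This is not the authors' implementation: the names `sample_noise`, `generator`, `decode`, `generate_parallel`, `CODE_DIM`, and `VOCAB` are hypothetical, the "networks" are hand-written stand-ins for trained models, and the index-aligned toy vocabulary is an assumption made only so that a single code visibly yields a parallel pair.

```python
import math
import random

CODE_DIM = 4  # hypothetical size of the shared latent code

# Toy vocabularies, aligned by index (an assumption for illustration only).
VOCAB = {
    "en": ["a", "dog", "runs", "fast"],
    "fr": ["un", "chien", "court", "vite"],
}


def sample_noise(rng, dim=CODE_DIM):
    """Draw Gaussian noise z, the input to the GAN generator."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def generator(z):
    """Stand-in for the trained generator: maps noise z to a latent code.

    A real generator would be a learned network trained adversarially to
    mimic codes produced by the shared encoder.
    """
    return [math.tanh(0.5 * x) for x in z]


def decode(code, lang):
    """Stand-in for the shared decoder: one code, output language set by `lang`.

    Here a word is emitted when its code coordinate is positive; a real
    decoder would be a sequence model conditioned on the code.
    """
    return [word for word, c in zip(VOCAB[lang], code) if c > 0]


def generate_parallel(rng):
    """Sample one code and decode it into both languages."""
    code = generator(sample_noise(rng))
    return decode(code, "en"), decode(code, "fr")


if __name__ == "__main__":
    english, french = generate_parallel(random.Random(0))
    print(english, french)
```

The point of the sketch is only the design choice: because both outputs are decoded from the same sampled code, the two sentences are parallel by construction, and no source sentence is needed at generation time.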