CL · Jun 7, 2021

BERTGEN: Multi-task Generation through BERT

Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

arXiv:2106.03484v1

Originality: Incremental advance

AI Analysis

This work addresses multimodal and multilingual language generation. It is an incremental advance because it builds on existing pre-trained BERT-based models (VL-BERT and M-BERT) rather than introducing a new architecture.

The authors tackle multimodal and multilingual language generation by introducing BERTGEN, a decoder-only model that fuses VL-BERT and M-BERT. They report that it outperforms many strong baselines in image captioning, machine translation, and multimodal machine translation, and that its zero-shot generation is competitive with supervised counterparts.

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing the multimodal and multilingual pre-trained models VL-BERT and M-BERT, respectively. BERTGEN is auto-regressively trained for language generation tasks, namely image captioning, machine translation and multimodal machine translation, under a multitask setting. With a comprehensive set of evaluations, we show that BERTGEN outperforms many strong baselines across the tasks explored. We also show BERTGEN's ability for zero-shot language generation, where it exhibits performance competitive with supervised counterparts. Finally, we conduct ablation studies which demonstrate that BERTGEN substantially benefits from multi-tasking and effectively transfers relevant inductive biases from the pre-trained models.
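
The abstract says BERTGEN is trained auto-regressively but does not spell out the mechanics. One common way to get auto-regressive generation from a BERT-style masked language model is to predict a single `[MASK]` token appended to the prefix generated so far. The toy sketch below illustrates that idea only; it is an assumption for illustration, not the authors' implementation. The names `unroll`, `greedy_decode`, `predict_masked` and the `[STOP]` token are hypothetical.

```python
from typing import Callable, List, Tuple

MASK = "[MASK]"   # placeholder the model is asked to fill in
STOP = "[STOP]"   # hypothetical end-of-sequence token


def unroll(context: List[str], target: List[str]) -> List[Tuple[List[str], str]]:
    """Split one (context, target) pair into one training example per target token.

    Each example is (context + target prefix + [MASK], token to predict at the mask).
    """
    tokens = target + [STOP]
    return [(context + tokens[:i] + [MASK], label) for i, label in enumerate(tokens)]


def greedy_decode(
    context: List[str],
    predict_masked: Callable[[List[str]], str],
    max_len: int = 20,
) -> List[str]:
    """Generate left to right by repeatedly filling a trailing [MASK]."""
    generated: List[str] = []
    for _ in range(max_len):
        token = predict_masked(context + generated + [MASK])
        if token == STOP:
            break
        generated.append(token)
    return generated


# Toy stand-in for a trained model: a lookup table built from the unrolled examples.
examples = unroll(["a", "dog"], ["ein", "Hund"])
table = {tuple(inputs): label for inputs, label in examples}
output = greedy_decode(["a", "dog"], lambda seq: table[tuple(seq)])
print(output)
```

In a multitask setting such as the one the abstract describes, `context` could stand for source-language tokens, image region features, or both, so a single decoding loop of this shape would cover all three tasks.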
