CL · May 8, 2022

Scheduled Multi-task Learning for Neural Chat Translation

Tsinghua
arXiv:2205.03766v2643 citations · h-index: 49
Originality: Incremental advance
AI Analysis

This work aims to improve translation quality for conversational text. The advance appears incremental, since it builds on existing multi-task learning approaches.

The paper tackles insufficient data and simplistic joint training in Neural Chat Translation by proposing a scheduled multi-task learning framework with a three-stage training process. Experiments in four language directions support its effectiveness.

Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task learning on small-scale chat translation data. Although NCT models have achieved impressive success, they are still far from satisfactory because chat translation data is insufficient and the joint training schemes are simple. To address these issues, we propose a scheduled multi-task learning framework for NCT. Specifically, we devise a three-stage training framework that incorporates large-scale in-domain chat translation data into training by adding a second pre-training stage between the original pre-training and fine-tuning stages. Further, we investigate where and how to schedule the dialogue-related auxiliary tasks across the training stages to effectively enhance the main chat translation task. Extensive experiments in four language directions (English↔Chinese and English↔German) verify the effectiveness and superiority of the proposed approach. Additionally, we have made the large-scale in-domain paired bilingual dialogue dataset publicly available to the research community.
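The three-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stage names, dataset labels, auxiliary task names (`aux_task_a`, `aux_task_b`), the choice of which stages activate auxiliary tasks, and the `aux_weight` parameter are all hypothetical placeholders chosen for illustration. The sketch only shows how a schedule could decide which auxiliary losses are added to the main translation loss at each stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage: which data it uses and which auxiliary tasks are active."""
    name: str
    dataset: str                 # hypothetical dataset label
    aux_tasks: tuple = ()        # hypothetical auxiliary task names


# Hypothetical schedule: a second pre-training stage on in-domain data sits
# between general pre-training and fine-tuning. Which stages enable auxiliary
# tasks is an assumption here, not a claim about the paper's final setting.
SCHEDULE = [
    Stage("pretrain_general", "general_domain_parallel"),
    Stage("pretrain_in_domain", "in_domain_dialogue_parallel",
          aux_tasks=("aux_task_a", "aux_task_b")),
    Stage("finetune_chat", "chat_translation",
          aux_tasks=("aux_task_a", "aux_task_b")),
]


def stage_loss(stage, losses, aux_weight=1.0):
    """Main translation loss plus the weighted losses of this stage's active auxiliary tasks."""
    total = losses["translation"]
    for task in stage.aux_tasks:
        total += aux_weight * losses[task]
    return total


if __name__ == "__main__":
    # Made-up per-batch loss values, used only to exercise the schedule.
    example_losses = {"translation": 2.0, "aux_task_a": 0.5, "aux_task_b": 0.25}
    for stage in SCHEDULE:
        print(stage.name, stage.dataset, stage_loss(stage, example_losses, aux_weight=0.5))
```

In a real system each entry of `losses` would come from a model forward pass on a batch drawn from the stage's dataset; the dictionary of fixed numbers above only stands in for that.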

Code Implementations: 1 repo
