CL AI Sep 17, 2024

Task Arithmetic for Language Expansion in Speech Translation

Yao-Fei Cheng, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Wen Shen Teo, Siddhant Arora, Shinji Watanabe

arXiv:2409.11274v34.26 citations h-index: 18

Originality: Incremental advance

AI Analysis

The method reduces the cost of expanding speech translation systems to new languages, though the contribution is incremental because it builds on existing task arithmetic methods.

The paper tackles the cost of adding language pairs to speech translation systems by proposing an augmented task arithmetic method that avoids re-training, achieving BLEU score improvements of up to 4.92 and COMET gains of up to 11.83 on the MuST-C and CoVoST-2 benchmarks.

Recent progress in large language models (LLMs) has spurred interest in speech-text multimodal foundation models, which achieve strong performance on instruction-tuned speech translation (ST). However, expanding language pairs is costly due to re-training on combined new and previous datasets. To address this, we aim to build a one-to-many ST system from existing one-to-one ST systems using task arithmetic without re-training. Direct application of task arithmetic in ST leads to language confusion; therefore, we introduce an augmented task arithmetic method incorporating a language control model to ensure correct target language generation. Our experiments on MuST-C and CoVoST-2 show BLEU score improvements of up to 4.66 and 4.92, with COMET gains of 8.87 and 11.83. In addition, we demonstrate our framework can extend to language pairs lacking paired ST training data or pre-trained ST models by synthesizing ST models based on existing machine translation (MT) and ST models via task analogies.
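
The two operations the abstract names, task arithmetic and task analogies, can be illustrated with a minimal sketch. This is generic task arithmetic on toy parameter dictionaries, not the paper's implementation: the helper names (`task_vector`, `merge`, `analogy`), the `scale` parameter, and the toy weights are all hypothetical. The abstract does not specify how the language control model enters the merge; one plausible reading, assumed here, is that it contributes one more task vector to the sum.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(finetuned: Params, pretrained: Params) -> Params:
    """Task vector: tau = theta_finetuned - theta_pretrained, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def merge(pretrained: Params, vectors: List[Params], scale: float = 1.0) -> Params:
    """Merged weights: theta_pretrained + scale * sum of task vectors."""
    merged = {name: list(weights) for name, weights in pretrained.items()}
    for tau in vectors:
        for name in merged:
            merged[name] = [m + scale * t for m, t in zip(merged[name], tau[name])]
    return merged


def analogy(st_known: Params, mt_new: Params, mt_known: Params) -> Params:
    """Standard task-analogy form: tau_ST_new ~= tau_ST_known + (tau_MT_new - tau_MT_known)."""
    return {
        name: [s + (n - o) for s, n, o in zip(st_known[name], mt_new[name], mt_known[name])]
        for name in st_known
    }


# Hypothetical weights for a shared base model and two one-to-one ST models.
base = {"w": [1.0, 2.0]}
st_en_de = {"w": [1.5, 2.0]}
st_en_fr = {"w": [1.0, 3.0]}

tau_de = task_vector(st_en_de, base)
tau_fr = task_vector(st_en_fr, base)

# One-to-many model built without re-training. A language control model's
# task vector (assumption) would be appended to this list in the same way.
one_to_many = merge(base, [tau_de, tau_fr])
```

The `analogy` helper covers the case where no ST model exists for a target language: the difference between two MT task vectors is added to an existing ST task vector, and the result can be passed to `merge` like any other task vector.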
