CL AS · Jun 5, 2025

IIITH-BUT system for IWSLT 2025 low-resource Bhojpuri to Hindi speech translation

arXiv:2506.04714v1 · 2 citations · h-index: 20 · IWSLT
Originality: Synthesis-oriented
AI Analysis

This work addresses speech translation for a low-resource language pair. The contribution is incremental: it applies existing techniques to a new dataset.

The paper tackles low-resource Bhojpuri-to-Hindi speech translation by fine-tuning the SeamlessM4T model with hyperparameter optimization and data augmentation, which yields significant improvements in BLEU.

This paper presents the submission of IIITH-BUT to the IWSLT 2025 shared task on speech translation for the low-resource Bhojpuri-Hindi language pair. We explored the impact of hyperparameter optimisation and data augmentation techniques on the performance of the SeamlessM4T model fine-tuned for this specific task. We systematically investigated a range of hyperparameters including learning rate schedules, number of update steps, warm-up steps, label smoothing, and batch sizes; and report their effect on translation quality. To address data scarcity, we applied speed perturbation and SpecAugment and studied their effect on translation quality. We also examined the use of cross-lingual signal through joint training with Marathi and Bhojpuri speech data. Our experiments reveal that careful selection of hyperparameters and the application of simple yet effective augmentation techniques significantly improve performance in low-resource settings. We also analysed the translation hypotheses to understand various kinds of errors that impacted the translation quality in terms of BLEU.
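The two augmentation techniques named in the abstract, speed perturbation and SpecAugment, can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the function names `speed_perturb` and `spec_augment`, the linear-interpolation resampler, the speed factors 0.9 and 1.1, and the mask widths are all assumptions chosen for illustration, and the time-warping step of the original SpecAugment recipe is omitted.

```python
import random


def speed_perturb(samples, factor):
    """Resample a waveform by linear interpolation so it plays `factor` times faster.

    factor > 1 shortens the signal, factor < 1 lengthens it.
    (Hypothetical helper; real toolkits use a proper resampling filter.)
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / factor)))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * factor                 # fractional index into the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def spec_augment(spec, max_freq_width, max_time_width, rng, mask_value=0.0):
    """Apply one frequency mask and one time mask to spec[time][freq].

    Returns a masked copy; the input is left untouched.
    (Hypothetical helper; mask widths are illustrative assumptions.)
    """
    n_time = len(spec)
    n_freq = len(spec[0]) if n_time else 0
    out = [row[:] for row in spec]

    # Frequency mask: blank a random band of consecutive feature bins.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for row in out:
        for k in range(f0, f0 + f):
            row[k] = mask_value

    # Time mask: blank a random run of consecutive frames.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for j in range(t0, t0 + t):
        out[j] = [mask_value] * n_freq
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    wave = [float(i % 10) for i in range(100)]
    slower = speed_perturb(wave, 0.9)    # assumed factor, longer signal
    faster = speed_perturb(wave, 1.1)    # assumed factor, shorter signal
    print(len(wave), len(slower), len(faster))

    features = [[1.0] * 8 for _ in range(20)]   # 20 frames x 8 bins
    masked = spec_augment(features, max_freq_width=3, max_time_width=5, rng=rng)
    print(sum(v == 0.0 for row in masked for v in row), "cells masked")
```

Speed perturbation changes the waveform before feature extraction, so each perturbed copy acts as an extra training utterance, while the masking is applied to the features of each utterance and is typically re-sampled every epoch.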
