CL | Apr 1, 2024

AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness

Miaoran Zhang, Mingyang Wang, Jesujoba O. Alabi, Dietrich Klakow

arXiv:2404.01490v2 | 15.227 citations | h-index: 35 | Has Code | SemEval

Originality: Incremental advance

AI Analysis

This work addresses the challenge of low-resource languages in semantic textual relatedness, but it is incremental as it builds on existing methods like data augmentation and adapters for a specific shared task.

The paper tackles the problem of measuring semantic textual relatedness for under-represented languages by combining machine translation for data augmentation with task-adaptive pre-training. At SemEval-2024 Task 1, the system achieved the best performance among all ranked teams in both the supervised learning subtask (A) and the cross-lingual transfer subtask (C).

This paper presents our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages. The shared task aims at measuring the semantic textual relatedness between pairs of sentences, with a focus on a range of under-represented languages. In this work, we propose using machine translation for data augmentation to address the low-resource challenge of limited training data. Moreover, we apply task-adaptive pre-training on unlabeled task data to bridge the gap between pre-training and task adaptation. For model training, we investigate both full fine-tuning and adapter-based tuning, and adopt the adapter framework for effective zero-shot cross-lingual transfer. We achieve competitive results in the shared task: our system performs the best among all ranked teams in both subtask A (supervised learning) and subtask C (cross-lingual transfer).
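The translation-based augmentation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that labeled sentence pairs are translated into another language and that the original relatedness score is carried over unchanged. The abstract does not say which translation system or translation direction the authors used, so `augment_with_translation`, `toy_translate`, and `TOY_LEXICON` are hypothetical names, and the word-by-word lookup is only a stand-in for a real machine translation model.

```python
from typing import Callable, List, Tuple

# (sentence 1, sentence 2, relatedness score)
Pair = Tuple[str, str, float]


def augment_with_translation(
    pairs: List[Pair], translate: Callable[[str], str]
) -> List[Pair]:
    """Translate both sentences of each labeled pair and keep the original score.

    Assumption: relatedness is roughly preserved under translation, so the
    label can be reused for the translated pair.
    """
    return [(translate(s1), translate(s2), score) for s1, s2, score in pairs]


# Hypothetical toy lexicon standing in for a real MT system.
TOY_LEXICON = {"the": "le", "cat": "chat", "sleeps": "dort", "dog": "chien"}


def toy_translate(sentence: str) -> str:
    """Word-by-word lookup; unknown tokens are passed through unchanged."""
    return " ".join(TOY_LEXICON.get(tok, tok) for tok in sentence.lower().split())


source_pairs: List[Pair] = [("The cat sleeps", "The dog sleeps", 0.7)]

# The augmented training set is the original data plus its translations.
train_set = source_pairs + augment_with_translation(source_pairs, toy_translate)
```

In practice, `toy_translate` would be replaced by a call to an actual translation model, and the resulting pairs would be added to the low-resource language's training data before fine-tuning.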
