Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs
This work addresses culturally-nuanced machine translation for English-Korean pairs and exposes gaps in automatic evaluation metrics, though its contribution is incremental because it focuses on benchmarking existing models.
The study evaluated 13 models for English-Korean translation, finding that LLMs outperform traditional MT systems but struggle with entity translation requiring cultural adaptation, with performance varying by entity type and popularity.
Translating knowledge-intensive and entity-rich text between English and Korean requires transcreation to preserve language-specific and cultural nuances beyond literal, phonetic, or word-for-word conversion. We evaluate 13 models (LLMs and MT models) using automatic metrics and human assessment by bilingual annotators. Our findings show that LLMs outperform traditional MT systems but struggle with entity translation requiring cultural adaptation. By constructing an error taxonomy, we identify incorrect responses and entity name errors as key issues, with performance varying by entity type and popularity level. This work exposes gaps in automatic evaluation metrics, and we hope it enables future work toward culturally-nuanced machine translation.