Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs
This work addresses literary translation, a long-standing challenge in machine translation that matters to both researchers and industry. Its contribution is incremental: it builds on existing shared tasks and extends them with a focus on discourse-level aspects.
The paper introduces the WMT 2023 shared task on discourse-level literary translation and releases a copyrighted Chinese-English web novel corpus together with industry-endorsed evaluation criteria. The 14 submissions from 7 teams were evaluated with both automatic and human methods. The official ranking is based on human judgments, and the paper also analyzes the findings.
Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a new shared task at WMT 2023, the first edition of the Discourse-Level Literary Translation task. First, we (Tencent AI Lab and China Literature Ltd.) release a copyrighted, document-level Chinese-English web novel corpus. Furthermore, we put forth industry-endorsed criteria to guide the human evaluation process. This year, we received a total of 14 submissions from 7 teams in academia and industry. We employ both automatic and human evaluations to measure the performance of the submitted systems. The official ranking of the systems is based on the overall human judgments. In addition, our extensive analysis reveals a series of interesting findings on literary and discourse-aware MT. We release data, system outputs, and leaderboard at http://www2.statmt.org/wmt23/literary-translation-task.html.