CL | AI | Jan 29, 2024

MultiMUC: Multilingual Template Filling on MUC-4

arXiv:2401.16209v1 | 107 citations | h-index: 60 | EACL
Originality: Synthesis-oriented
AI Analysis

This paper provides a new benchmark for multilingual template filling, but the contribution is incremental because it extends an existing dataset, MUC-4.

The authors address the lack of multilingual resources for template filling by creating MultiMUC, a parallel corpus that translates MUC-4 into five languages, and they report baseline results using state-of-the-art template filling models and ChatGPT.

We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian. We obtain automatic translations from a strong multilingual machine translation system and manually project the original English annotations into each target language. For all languages, we also provide human translations for sentences in the dev and test splits that contain annotated template arguments. Finally, we present baselines on MultiMUC both with state-of-the-art template filling models and with ChatGPT.

Code Implementations: 1 repo