CL · Oct 8, 2021

The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results

arXiv:2110.04392
Originality: Synthesis-oriented
AI Analysis

This is the first shared task on explainable NLP evaluation metrics. It addresses the need for interpretable quality estimation among machine translation researchers and practitioners.

The paper introduces the Eval4NLP-2021 shared task on explainable quality estimation for machine translation. Systems must provide a sentence-level quality score and a word-level explanation that identifies the words that negatively impact translation quality. The paper presents the data, annotation guidelines, and results from six participating systems.

In this paper, we introduce the Eval4NLP-2021 shared task on explainable quality estimation. Given a source-translation pair, this shared task requires not only to provide a sentence-level score indicating the overall quality of the translation, but also to explain this score by identifying the words that negatively impact translation quality. We present the data, annotation guidelines and evaluation setup of the shared task, describe the six participating systems, and analyze the results. To the best of our knowledge, this is the first shared task on explainable NLP evaluation metrics. Datasets and results are available at https://github.com/eval4nlp/SharedTask2021.

Code Implementations: 1 repo
