CL · Mar 21, 2024

From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation

arXiv:2403.14118v2 · 19 citations · h-index: 15 · IJCNN
Originality: Synthesis-oriented
AI Analysis

The paper synthesizes existing research for practitioners and researchers in machine translation. Its contribution is incremental, since it does not introduce new methods or results.

This survey paper provides a comprehensive overview of Machine Translation Quality Estimation (MTQE), covering its datasets, annotation methods, shared tasks, methodologies, challenges, and future directions, categorizing methods from handcrafted features to deep learning and Large Language Models.

Machine Translation Quality Estimation (MTQE) is the task of estimating the quality of machine-translated text in real time without the need for reference translations, which is of great importance for the development of MT. After two decades of evolution, QE has yielded a wealth of results. This article provides a comprehensive overview of QE datasets, annotation methods, shared tasks, methodologies, challenges, and future research directions. It begins with an introduction to the background and significance of QE, followed by an explanation of the concepts and evaluation metrics for word-level QE, sentence-level QE, document-level QE, and explainable QE. The paper categorizes the methods developed throughout the history of QE into those based on handcrafted features, deep learning, and Large Language Models (LLMs), with a further division of deep learning-based methods into classic deep learning and those incorporating pre-trained language models (LMs). Additionally, the article details the advantages and limitations of each method and offers a straightforward comparison of different approaches. Finally, the paper discusses the current challenges in QE research and provides an outlook on future research directions.

