Evaluating Machine Translation Quality with Conformal Predictive Distributions
This work addresses the need for reliable uncertainty estimation in machine translation, which is crucial for users and developers in natural language processing, though it appears incremental as it builds on existing conformal prediction techniques.
The paper tackles the problem of assessing uncertainty in machine translation by introducing a method that uses conformal predictive distributions to evaluate translation quality and provide reliable confidence scores. It demonstrates that the method outperforms a baseline on six language pairs in terms of coverage and sharpness.
This paper presents a new approach for assessing uncertainty in machine translation by simultaneously evaluating translation quality and providing a reliable confidence score. Our approach utilizes conformal predictive distributions to produce prediction intervals with guaranteed coverage, meaning that for any given significance level $ε$, we can expect the true quality score of a translation to fall within the interval at a rate of at least $1-ε$, or equivalently to fall outside it at a rate of at most $ε$. We demonstrate that our method outperforms a simple but effective baseline on six different language pairs in terms of coverage and sharpness. Furthermore, we validate that our approach requires the data exchangeability assumption to hold for optimal performance.
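To make the coverage guarantee concrete, the following is a minimal sketch of how a prediction interval can be read off a conformal predictive distribution. It is not the paper's implementation: it assumes a split (inductive) setup with a held-out calibration set and uses signed residuals as the conformity score. The function names `split_conformal_interval` and `coverage_and_sharpness` are hypothetical, and any quality estimator could supply the predicted scores.

```python
import numpy as np


def split_conformal_interval(cal_pred, cal_true, test_pred, epsilon=0.1):
    """Interval for the true quality score at significance level epsilon.

    Assumption (not from the paper): a split conformal predictive
    distribution built from signed calibration residuals.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    # Signed residuals (true minus predicted) on the calibration set, sorted.
    residuals = np.sort(np.asarray(cal_true, float) - np.asarray(cal_pred, float))
    n = residuals.size
    # Pad with -inf/+inf: the predictive distribution places mass 1/(n+1)
    # on each of the n+1 gaps between consecutive padded residuals.
    padded = np.concatenate(([-np.inf], residuals, [np.inf]))
    # Cut at most epsilon/2 of that mass from each tail.
    lo_idx = int(np.floor((epsilon / 2) * (n + 1)))
    hi_idx = int(np.ceil((1 - epsilon / 2) * (n + 1)))
    test_pred = np.asarray(test_pred, float)
    return test_pred + padded[lo_idx], test_pred + padded[hi_idx]


def coverage_and_sharpness(lower, upper, y_true):
    """Empirical coverage (fraction of true scores inside the interval)
    and sharpness (mean interval width; narrower is sharper)."""
    y_true = np.asarray(y_true, float)
    covered = (y_true >= lower) & (y_true <= upper)
    return float(covered.mean()), float(np.mean(upper - lower))
```

If the calibration and test examples are exchangeable, the empirical coverage from `coverage_and_sharpness` should be close to or above $1-ε$. When exchangeability is violated, for example under a domain shift between calibration and test data, the guarantee no longer applies and coverage can fall below that level.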