Towards Red Teaming in Multimodal and Multilingual Translation
The work addresses the problem of overestimated performance in machine translation, a concern for both researchers and developers. Its contribution is incremental, since it adapts an existing red teaming method to a new application area.
This paper tackles the challenge of evaluating machine translation models by introducing human-based red teaming to generate edge cases that cause critical errors, marking the first such study in this domain and providing recommendations for improving model reliability.
Assessing performance in Natural Language Processing is becoming increasingly complex. One particular challenge is that evaluation datasets may overlap with training data, either directly or indirectly, which can skew results and lead to overestimation of model performance. As a consequence, human evaluation is attracting growing interest as a means of assessing the performance and reliability of models. One such method is red teaming, which aims to generate edge cases in which a model produces critical errors. While this methodology is becoming standard practice for generative AI, its application to conditional AI remains largely unexplored. This paper presents the first study on human-based red teaming for Machine Translation (MT), a step towards understanding and improving the performance of translation models. We examine both human-based red teaming and a study on automation, reporting lessons learned and providing recommendations for both translation models and red teaming drills. This work opens up new avenues for research and development in the field of MT.