Evaluation Methodology for Large Language Models for Multilingual Document Question and Answer
This work addresses the need for better evaluation methods in multilingual AI applications, but it appears incremental as it builds on existing translation-based approaches.
The paper tackles the problem of evaluating the multilingual capabilities of Large Language Models for document question answering, and finds that translating the context, question, and answer into a high-resource language yields the best results.
Given the widespread adoption of Large Language Models (LLMs), in this paper we investigate the multilingual capability of these models. Our preliminary results show that translating the native-language context, question, and answer into a high-resource language produces the best results.
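
The comparison described in the abstract (scoring answers on the native-language text versus on text translated into a high-resource language) could be set up roughly as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: `answer_fn` stands in for a call to an LLM, `translate_fn` for a translation system, and the token-overlap F1 metric is a common question answering score chosen here for illustration. None of these names or choices come from the paper.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two whitespace-tokenized strings.

    Assumption: the paper does not state its metric; this one is illustrative.
    """
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(
    examples: List[Dict[str, str]],
    answer_fn: Callable[[str, str], str],
    translate_fn: Optional[Callable[[str], str]] = None,
) -> float:
    """Mean token F1 over the examples.

    If translate_fn is given, the context, question, and reference answer
    are all translated before the model is queried and scored.
    answer_fn and translate_fn are hypothetical placeholders.
    """
    scores = []
    for ex in examples:
        context, question, reference = ex["context"], ex["question"], ex["answer"]
        if translate_fn is not None:
            context, question, reference = (
                translate_fn(text) for text in (context, question, reference)
            )
        prediction = answer_fn(context, question)
        scores.append(token_f1(prediction, reference))
    return sum(scores) / len(scores) if scores else 0.0
```

Calling `evaluate(examples, answer_fn)` scores the native-language setting, and `evaluate(examples, answer_fn, translate_fn)` scores the translated setting, so the two means can be compared directly. Whitespace tokenization is a simplification that does not suit languages written without spaces between words; a language-aware tokenizer would be needed there.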