SandboxAQ's submission to MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval
This work addresses a problem NLP practitioners face, namely inconsistent model performance across tasks and languages, but it is incremental: it primarily evaluates existing methods and introduces no new techniques.
The paper tackles multilingual question answering and named entity recognition across five languages by testing five large language models with various prompting methods. It finds that model effectiveness varies by task and language, with advanced prompting improving QA but giving mixed results for NER.
This paper explores the problems of Question Answering (QA) and Named Entity Recognition (NER) in five diverse languages. We tested five Large Language Models with various prompting methods, including zero-shot, chain-of-thought reasoning, and translation techniques. Our results show that while some models consistently outperform others, their effectiveness varies significantly across tasks and languages. Advanced prompting techniques generally improved QA performance but had mixed results for NER, and language difficulty patterns differed between tasks. Our findings highlight the need for task-specific approaches in multilingual NLP and suggest that current models may develop different linguistic competencies for different tasks.
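To make the three prompting families named above concrete, the following is a minimal sketch of how zero-shot, chain-of-thought, and translation-based QA prompts might be assembled. The function name `build_qa_prompt`, the strategy labels, and the prompt wording are all hypothetical illustrations; the abstract does not give the prompts actually used, and the translation variant here assumes English as the pivot language.

```python
def build_qa_prompt(question: str, context: str, strategy: str = "zero_shot") -> str:
    """Assemble a QA prompt for one of three illustrative strategies.

    The templates are assumptions made for illustration only, not the
    prompts used in the paper.
    """
    base = f"Context: {context}\nQuestion: {question}\n"

    if strategy == "zero_shot":
        # Ask for the answer directly, with no examples or reasoning cue.
        return base + "Answer:"

    if strategy == "chain_of_thought":
        # Ask the model to reason before committing to a final answer.
        return base + (
            "Think step by step, then give the final answer "
            "on a new line starting with 'Answer:'."
        )

    if strategy == "translate":
        # Assumed pivot-language variant: reason in English, answer in
        # the original language of the question.
        return (
            "Translate the context and question into English, answer in "
            "English, then translate the answer back into the original "
            "language.\n" + base + "Answer:"
        )

    raise ValueError(f"Unknown strategy: {strategy}")


if __name__ == "__main__":
    for name in ("zero_shot", "chain_of_thought", "translate"):
        print(f"--- {name} ---")
        print(build_qa_prompt("Who wrote the report?", "The report was written by Ada.", name))
```

Keeping the strategy as a parameter means the same question set can be run under each prompting method and scored per task and per language, which is the kind of comparison the abstract describes.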