IR | CL | Dec 26, 2023

Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages

arXiv:2312.16159v1 | 31 citations | h-index: 87 | Has Code | ACL
Originality: Synthesis-oriented
AI Analysis

This paper addresses information retrieval for speakers of low-resource languages, but its contribution is incremental: it extends existing LLM reranking methods to new languages.

The study tackled the gap in evaluating large language models (LLMs) for zero-shot cross-lingual reranking in low-resource African languages, finding that cross-lingual reranking can be competitive with monolingual approaches depending on the LLM's multilingual capabilities.

Large language models (LLMs) have shown impressive zero-shot capabilities in various document reranking tasks. Despite these successes, the existing literature says little about their effectiveness in low-resource languages. To address this gap, we investigate how LLMs function as rerankers in cross-lingual information retrieval (CLIR) systems for African languages. Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba), and we examine cross-lingual reranking with queries in English and passages in the African languages. Additionally, we analyze and compare the effectiveness of monolingual reranking using both query and document translations. We also evaluate the effectiveness of LLMs when leveraging their own generated translations. To compare multiple LLMs, our study covers the proprietary models RankGPT-4 and RankGPT-3.5, along with the open-source model RankZephyr. While reranking remains most effective in English, our results reveal that cross-lingual reranking may be competitive with reranking in African languages, depending on the multilingual capability of the LLM.
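To make the cross-lingual setup in the abstract concrete, here is a minimal sketch of listwise reranking: an English query and passages in another language go into one prompt, and the model's reply is parsed as an ordering. The prompt wording, the `[2] > [1]` reply format, and the function names `build_listwise_prompt`, `parse_permutation`, and `rerank` are all illustrative assumptions, not the paper's actual prompts or code. No LLM is called here; the reply string is supplied by hand.

```python
import re
from typing import List


def build_listwise_prompt(query: str, passages: List[str]) -> str:
    """Assemble a listwise ranking prompt (hypothetical wording)."""
    lines = [
        f"Rank the {len(passages)} passages below by relevance to the query.",
        f"Query: {query}",
        "",
    ]
    # Passages are numbered from 1 so the model can refer to them as [1], [2], ...
    for i, passage in enumerate(passages, start=1):
        lines.append(f"[{i}] {passage}")
    lines.append("")
    lines.append("Answer with identifiers in descending relevance, e.g. [2] > [1].")
    return "\n".join(lines)


def parse_permutation(response: str, n: int) -> List[int]:
    """Turn a reply such as '[3] > [1]' into zero-based indices.

    Duplicates and out-of-range identifiers are dropped; passages the
    reply does not mention keep their original relative order at the end.
    """
    order: List[int] = []
    for token in re.findall(r"\[(\d+)\]", response):
        idx = int(token) - 1
        if 0 <= idx < n and idx not in order:
            order.append(idx)
    missing = [i for i in range(n) if i not in order]
    return order + missing


def rerank(passages: List[str], response: str) -> List[str]:
    """Reorder passages according to the parsed reply."""
    return [passages[i] for i in parse_permutation(response, len(passages))]


if __name__ == "__main__":
    # Placeholder passages; in the cross-lingual setting these would be
    # Hausa, Somali, Swahili, or Yoruba text retrieved by a first-stage system.
    candidates = ["passage A", "passage B", "passage C", "passage D"]
    print(build_listwise_prompt("example English query", candidates))
    # A hand-written reply standing in for the model's output.
    print(rerank(candidates, "[3] > [1] > [3] > [9]"))
```

The tolerant parser is a design choice for this sketch: a model's reply can repeat or omit identifiers, so falling back to the first-stage order for unmentioned passages keeps the output a valid permutation.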

