CL, IR · Aug 9, 2025

Two-Stage Quranic QA via Ensemble Retrieval and Instruction-Tuned Answer Extraction

Mohamed Basem, Islam Oshallah, Ali Hamdi, Khaled Shaban, Hozaifa Kassab

arXiv:2508.06971v24.91 citations · h-index: 3 · AICCSA

Originality: Incremental advance

AI Analysis

This work addresses low-resource question answering in specialized domains such as religious texts. It is an incremental advance: it builds on existing methods by combining model ensembling with instruction tuning.

The paper tackles Quranic question answering with a two-stage framework for passage retrieval and answer extraction. It reports state-of-the-art results: a MAP@10 of 0.3128 and an MRR@10 of 0.5763 for retrieval, and a pAP@10 of 0.669 for extraction.
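For readers unfamiliar with the retrieval metrics quoted above, the following is a minimal sketch of how MAP@10 and MRR@10 are commonly computed. It assumes binary relevance judgments and divides average precision by the total number of relevant passages; the shared task's official scorer may differ in such details, and all function names here are illustrative only.

```python
from typing import Callable, Dict, List, Set


def reciprocal_rank_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Return 1/rank of the first relevant passage in the top k, or 0.0 if none."""
    for rank, pid in enumerate(ranked[:k], start=1):
        if pid in relevant:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Average precision over the top k, normalized by the number of relevant passages.

    Assumption: the denominator is len(relevant); some scorers use min(len(relevant), k).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, pid in enumerate(ranked[:k], start=1):
        if pid in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)


def mean_over_queries(
    metric: Callable[[List[str], Set[str], int], float],
    runs: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Average a per-query metric over all queries in `runs`."""
    if not runs:
        return 0.0
    return sum(metric(runs[q], qrels.get(q, set()), k) for q in runs) / len(runs)
```

With these helpers, `mean_over_queries(average_precision_at_k, runs, qrels)` gives MAP@10 and `mean_over_queries(reciprocal_rank_at_k, runs, qrels)` gives MRR@10, where `runs` maps each query ID to its ranked passage IDs and `qrels` maps it to the set of relevant passage IDs.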

Quranic Question Answering presents unique challenges due to the linguistic complexity of Classical Arabic and the semantic richness of religious texts. In this paper, we propose a novel two-stage framework that addresses both passage retrieval and answer extraction. For passage retrieval, we ensemble fine-tuned Arabic language models to achieve superior ranking performance. For answer extraction, we employ instruction-tuned large language models with few-shot prompting to overcome the limitations of fine-tuning on small datasets. Our approach achieves state-of-the-art results on the Quran QA 2023 Shared Task, with a MAP@10 of 0.3128 and MRR@10 of 0.5763 for retrieval, and a pAP@10 of 0.669 for extraction, substantially outperforming previous methods. These results demonstrate that combining model ensembling and instruction-tuned language models effectively addresses the challenges of low-resource question answering in specialized domains.
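The abstract does not specify how the retrieval models are ensembled. As an illustration only, the sketch below shows one common way to combine rankers: min-max normalize each model's relevance scores per query, then take a weighted average. The fusion rule, the weights, and the names `min_max` and `ensemble_rank` are assumptions made for this example, not the authors' method.

```python
from typing import Dict, List, Optional, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one model's scores for a query to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All passages tied: this model contributes no ranking signal.
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def ensemble_rank(
    model_scores: List[Dict[str, float]],
    weights: Optional[List[float]] = None,
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Fuse per-model passage scores and return the top-k (passage_id, score) pairs.

    Assumption: weighted average of min-max normalized scores; a passage
    missing from a model's output contributes 0 for that model.
    """
    if not model_scores:
        return []
    if weights is None:
        weights = [1.0 / len(model_scores)] * len(model_scores)
    fused: Dict[str, float] = {}
    for weight, scores in zip(weights, model_scores):
        for pid, score in min_max(scores).items():
            fused[pid] = fused.get(pid, 0.0) + weight * score
    # Sort by fused score (descending), breaking ties by passage ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))[:k]
```

Normalizing before averaging keeps a model with a wide raw score range from dominating the others; rank-based fusion such as reciprocal rank fusion is a common alternative when raw scores are not comparable.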
