AraHealthQA 2025: The First Shared Task on Arabic Health Question Answering
This work addresses the shortage of Arabic medical QA datasets available to researchers and practitioners. Its contribution is incremental, however, because it adapts existing shared-task frameworks to a specific language and domain.
The paper introduces AraHealthQA 2025, the first shared task on Arabic health question answering. It addresses the lack of high-quality Arabic medical QA resources through two tracks, one focused on mental health and one on broader medical domains, and reports participation statistics and baseline system outcomes.
We introduce AraHealthQA 2025, the Comprehensive Arabic Health Question Answering Shared Task, held in conjunction with ArabicNLP 2025 (co-located with EMNLP 2025). This shared task addresses the paucity of high-quality Arabic medical QA resources by offering two complementary tracks: MentalQA, focusing on Arabic mental health Q&A (e.g., anxiety, depression, stigma reduction), and MedArabiQ, covering broader medical domains such as internal medicine, pediatrics, and clinical decision making. Each track comprises multiple subtasks, evaluation datasets, and standardized metrics, facilitating fair benchmarking. The task was structured to promote modeling under realistic, multilingual, and culturally nuanced healthcare contexts. We outline the dataset creation, task design, evaluation framework, participation statistics, and baseline systems, and summarize the overall outcomes. We conclude with reflections on the observed performance trends and on prospects for future iterations of Arabic health QA.