CL, Oct 20, 2025

Disparities in Multilingual LLM-Based Healthcare Q&A

arXiv:2510.17476v1, 3 citations, h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses equitable access to reliable health information in multilingual AI systems for healthcare. Its contribution is incremental, since it builds on existing methods such as RAG and dataset construction.

The study examines cross-lingual disparities in healthcare Q&A by analyzing Wikipedia coverage and LLM factual alignment across five languages. It finds substantial disparities: responses align more closely with English sources even when the prompts are non-English. Supplying contextual excerpts from non-English Wikipedia at inference time shifts alignment toward culturally relevant knowledge.

Equitable access to reliable health information is vital when integrating AI into healthcare. Yet, information quality varies across languages, raising concerns about the reliability and consistency of multilingual Large Language Models (LLMs). We systematically examine cross-lingual disparities in pre-training source and factuality alignment in LLM answers for multilingual healthcare Q&A across English, German, Turkish, Chinese (Mandarin), and Italian. We (i) constructed Multilingual Wiki Health Care (MultiWikiHealthCare), a multilingual dataset from Wikipedia; (ii) analyzed cross-lingual healthcare coverage; (iii) assessed LLM response alignment with these references; and (iv) conducted a case study on factual alignment through the use of contextual information and Retrieval-Augmented Generation (RAG). Our findings reveal substantial cross-lingual disparities in both Wikipedia coverage and LLM factual alignment. Across LLMs, responses align more with English Wikipedia, even when the prompts are non-English. Providing contextual excerpts from non-English Wikipedia at inference time effectively shifts factual alignment toward culturally relevant knowledge. These results highlight practical pathways for building more equitable, multilingual AI systems for healthcare.
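
Steps (iii) and (iv) of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how factual alignment is measured, so token-level Jaccard overlap is used here purely as a stand-in, and all texts are assumed to have been translated into a common language beforehand. The function names (`overlap`, `closest_reference`, `build_prompt`) and the prompt layout are hypothetical.

```python
import re
from typing import Dict, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lowercased word tokens; a crude stand-in for real fact extraction."""
    return set(re.findall(r"\w+", text.lower()))


def overlap(answer: str, reference: str) -> float:
    """Jaccard overlap between answer and reference tokens (assumed metric)."""
    a, r = tokens(answer), tokens(reference)
    union = a | r
    return len(a & r) / len(union) if union else 0.0


def closest_reference(answer: str, references: Dict[str, str]) -> str:
    """Return the language key whose reference text overlaps most with the answer."""
    return max(references, key=lambda lang: overlap(answer, references[lang]))


def build_prompt(question: str, excerpt: Optional[str] = None) -> str:
    """Optionally prepend a Wikipedia excerpt as context at inference time.

    The prompt layout is hypothetical, not taken from the paper.
    """
    if excerpt is None:
        return f"Question: {question}\nAnswer:"
    return f"Context: {excerpt}\nQuestion: {question}\nAnswer:"
```

Given an LLM answer and one reference text per language edition, `closest_reference(answer, {"en": ..., "de": ...})` reports which edition the answer tracks more closely under this toy metric. Comparing that result for answers generated with and without an `excerpt` passed to `build_prompt` mirrors, in spirit, the shift the abstract describes.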

