Fairness Evaluation of Large Language Models in Academic Library Reference Services
This study addresses fairness concerns for library patrons when LLMs are deployed in reference services, though its findings are incremental: they show limited bias in current models.
The study evaluated whether large language models (LLMs) exhibit bias when used in academic library reference services by testing six state-of-the-art models with prompts that varied the patron's sex, race/ethnicity, and institutional role. It found no evidence of differentiation by race or ethnicity, minor evidence of stereotypical bias against women in one model, and linguistic accommodation of institutional roles that reflected professional norms rather than discriminatory treatment.
As libraries explore large language models (LLMs) for use in virtual reference services, a key question arises: Can LLMs serve all users equitably, regardless of demographics or social status? While they offer great potential for scalable support, LLMs may also reproduce societal biases embedded in their training data, risking the integrity of libraries' commitment to equitable service. To address this concern, we evaluate whether LLMs differentiate responses across user identities by prompting six state-of-the-art LLMs to assist patrons differing in sex, race/ethnicity, and institutional role. We found no evidence of differentiation by race or ethnicity, and only minor evidence of stereotypical bias against women in one model. LLMs demonstrated nuanced accommodation of institutional roles through the use of linguistic choices related to formality, politeness, and domain-specific vocabularies, reflecting professional norms rather than discriminatory treatment. These findings suggest that current LLMs show a promising degree of readiness to support equitable and contextually appropriate communication in academic library reference services.
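The prompting design described above, in which patron identity is varied while the request stays fixed, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the attribute levels, the prompt template, the sample question, and the function name `build_prompts` are hypothetical and are not taken from the study, whose actual categories and wording may differ.

```python
from itertools import product

# Hypothetical attribute levels; the study's actual categories may differ.
SEXES = ["female", "male"]
ETHNICITIES = ["group_a", "group_b", "group_c"]
ROLES = ["undergraduate student", "graduate student", "faculty member"]

# Hypothetical template: only the patron description changes between prompts.
TEMPLATE = (
    "You are a reference librarian at an academic library. "
    "A patron ({role}; sex: {sex}; race/ethnicity: {ethnicity}) asks: {question}"
)


def build_prompts(question):
    """Return one prompt record per combination of sex, race/ethnicity, and role."""
    records = []
    for sex, ethnicity, role in product(SEXES, ETHNICITIES, ROLES):
        records.append({
            "sex": sex,
            "ethnicity": ethnicity,
            "role": role,
            "prompt": TEMPLATE.format(
                role=role, sex=sex, ethnicity=ethnicity, question=question
            ),
        })
    return records


prompts = build_prompts("How can I find peer-reviewed articles on climate policy?")
print(len(prompts))  # 2 sexes * 3 groups * 3 roles = 18
```

Holding the question constant and crossing every attribute level means that any systematic difference between responses can be attributed to the identity cues rather than to the request itself. Each record keeps its attribute labels so that responses can later be grouped by sex, race/ethnicity, or role.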