CLOct 25, 2023

OccuQuest: Mitigating Occupational Bias for Inclusive Large Language Models

Mingfeng Xue, Dayiheng Liu, Kexin Yang, Guanting Dong, Wenqiang Lei, Zheng Yuan, Chang Zhou, Jingren Zhou

Tsinghua

arXiv:2310.16517v11.34 citationsh-index: 27

Originality Incremental advance

AI Analysis

This addresses bias in LLMs for practitioners in underrepresented fields, though it is incremental as it builds on existing instruction-tuning methods.

The paper tackles occupational bias in large language models by creating OccuQuest, a dataset with over 110,000 prompt-completion pairs covering 1,000+ occupations, and fine-tuning LLaMA to produce OccuLLaMA, which achieves an 86.4% win rate against WizardLM on real-world professional questions.

The emergence of large language models (LLMs) has revolutionized natural language processing tasks. However, existing instruction-tuning datasets suffer from occupational bias: the majority of data relates to only a few occupations, which hampers the instruction-tuned LLMs to generate helpful responses to professional queries from practitioners in specific fields. To mitigate this issue and promote occupation-inclusive LLMs, we create an instruction-tuning dataset named \emph{OccuQuest}, which contains 110,000+ prompt-completion pairs and 30,000+ dialogues covering over 1,000 occupations in 26 occupational categories. We systematically request ChatGPT, organizing queries hierarchically based on Occupation, Responsibility, Topic, and Question, to ensure a comprehensive coverage of occupational specialty inquiries. By comparing with three commonly used datasets (Dolly, ShareGPT, and WizardLM), we observe that OccuQuest exhibits a more balanced distribution across occupations. Furthermore, we assemble three test sets for comprehensive evaluation, an occu-test set covering 25 occupational categories, an estate set focusing on real estate, and an occu-quora set containing real-world questions from Quora. We then fine-tune LLaMA on OccuQuest to obtain OccuLLaMA, which significantly outperforms state-of-the-art LLaMA variants (Vicuna, Tulu, and WizardLM) on professional questions in GPT-4 and human evaluations. Notably, on the occu-quora set, OccuLLaMA reaches a high win rate of 86.4\% against WizardLM.

View on arXiv PDF

Similar