CL AI HC Apr 17, 2023

Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding

Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-Yves Oudeyer

Microsoft

arXiv:2304.10548v1 15.8270 citations h-index: 55

Originality: Synthesis-oriented

AI Analysis

This work addresses two obstacles researchers face in qualitative analysis: limited access to AI resources and the limited generalizability of task-specific models. Its contribution is incremental, since it applies existing LLMs to a specific domain.

The study tackled the labor-intensive process of qualitative analysis by using GPT-3 with pre-determined codebooks for deductive coding, achieving fair to substantial agreement with expert-coded results on a curiosity-driven questions coding task.
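The phrase "fair to substantial" matches the vocabulary commonly used to interpret Cohen's kappa (the Landis and Koch bands). The summary above does not name the statistic, so the following is only a minimal sketch under that assumption; the function names `cohen_kappa` and `landis_koch` are illustrative and not taken from the paper.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items where both coders chose the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined,
        # and this sketch (an assumption) reports perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the conventional Landis and Koch description."""
    if kappa < 0:
        return "poor"
    for upper, name in [(0.20, "slight"), (0.40, "fair"),
                        (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return name
    return "almost perfect"
```

With expert labels in one list and model labels in the other, `landis_koch(cohen_kappa(expert, model))` gives the verbal band for that comparison.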

Qualitative analysis of textual content unpacks rich and valuable information by assigning labels to the data. However, this process is often labor-intensive, particularly when working with large datasets. While recent AI-based tools demonstrate utility, researchers may not have readily available AI resources and expertise, and the task-specific models behind those tools have limited generalizability. In this study, we explored the use of large language models (LLMs) in supporting deductive coding, a major category of qualitative analysis where researchers use pre-determined codebooks to label the data into a fixed set of codes. Instead of training task-specific models, a pre-trained LLM could be used directly for various tasks through prompt learning, without fine-tuning. Using a curiosity-driven questions coding task as a case study, we found that, by combining GPT-3 with expert-drafted codebooks, our proposed approach achieved fair to substantial agreement with expert-coded results. We lay out challenges and opportunities in using LLMs to support qualitative coding and beyond.
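To make the idea of combining a codebook with a prompted LLM concrete, here is a minimal sketch of how a codebook might be turned into a prompt and how a completion might be mapped back to a code. Everything in it is an assumption for illustration: the `Code` fields, the prompt wording, the two example codes, and the helper names are hypothetical and are not the authors' codebook or prompt. The call to the model itself is left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class Code:
    label: str        # short name the model is asked to output
    description: str  # codebook definition written by the expert
    example: str      # one labeled example from the codebook


# Hypothetical two-code codebook, used only to illustrate the format.
CODEBOOK = [
    Code("factual", "Asks for a single fact.", "How tall is the Eiffel Tower?"),
    Code("explanatory", "Asks for a cause or mechanism.", "Why do leaves change color?"),
]


def build_prompt(codebook, text):
    """Assemble a codebook-centered prompt ending where the model should answer."""
    lines = ["Assign exactly one code from the codebook to the question.",
             "", "Codebook:"]
    for code in codebook:
        lines.append(f"- {code.label}: {code.description} Example: {code.example}")
    lines += ["", f"Question: {text}", "Code:"]
    return "\n".join(lines)


def parse_label(completion, codebook):
    """Return the codebook label the completion starts with, or None if none match."""
    answer = completion.strip().lower()
    for code in codebook:
        if answer.startswith(code.label.lower()):
            return code.label
    return None
```

The string from `build_prompt` would be sent to whichever LLM is available, and `parse_label` keeps the output inside the fixed set of codes, returning `None` for anything that should go back to a human coder.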
