CL, AI, HC
Apr 17, 2023

Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding

Microsoft
arXiv:2304.10548v1, 270 citations, h-index: 55
Originality: Synthesis-oriented
AI Analysis

This paper addresses two obstacles researchers face in qualitative analysis: limited access to AI resources and the poor generalizability of task-specific models. The contribution is incremental, since it applies existing LLMs to a specific domain.

The study tackled the labor-intensive process of qualitative analysis by using GPT-3 with pre-determined codebooks for deductive coding, achieving fair to substantial agreement with expert-coded results on a curiosity-driven questions task.

Qualitative analysis of textual content unpacks rich and valuable information by assigning labels to the data. However, this process is often labor-intensive, particularly when working with large datasets. While recent AI-based tools demonstrate utility, researchers may lack readily available AI resources and expertise, and the task-specific models behind those tools generalize poorly. In this study, we explored the use of large language models (LLMs) in supporting deductive coding, a major category of qualitative analysis where researchers use pre-determined codebooks to label the data into a fixed set of codes. Instead of training task-specific models, a pre-trained LLM can be used directly for various tasks, without fine-tuning, through prompt learning. Using a curiosity-driven questions coding task as a case study, we found that, by combining GPT-3 with expert-drafted codebooks, our proposed approach achieved fair to substantial agreement with expert-coded results. We lay out challenges and opportunities in using LLMs to support qualitative coding and beyond.
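
The workflow described above can be sketched in two steps: assemble a prompt that embeds the codebook, then compare the model's labels with the expert's. The sketch below is illustrative only. The codebook entries, the prompt wording, and the function names are hypothetical and not taken from the paper, and no model call is made. The phrase "fair to substantial" matches the Landis and Koch bands commonly applied to Cohen's kappa, so the sketch assumes that statistic.

```python
from collections import Counter

# Hypothetical codebook: names and descriptions are illustrative,
# not the codebook used in the paper.
CODEBOOK = {
    "causal": "Asks why something happens or what causes it.",
    "factual": "Asks for a specific fact, name, or quantity.",
    "procedural": "Asks how to do something or how a process works.",
}


def build_prompt(codebook, text):
    """Assemble a codebook-based prompt for one item (assumed wording)."""
    lines = [
        "Label the question with exactly one code from the codebook.",
        "",
        "Codebook:",
    ]
    for code, description in codebook.items():
        lines.append(f"- {code}: {description}")
    lines += ["", f"Question: {text}", "Code:"]
    return "\n".join(lines)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: share of items where both raters chose the same code.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal rate, summed over codes.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:  # both raters used one identical code throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch (1977) agreement band."""
    if kappa < 0:
        return "poor"
    for upper, name in [(0.20, "slight"), (0.40, "fair"),
                        (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return name
    return "almost perfect"


# Made-up labels for six items, purely to exercise the functions.
expert = ["causal", "factual", "procedural", "causal", "factual", "causal"]
model = ["causal", "factual", "causal", "causal", "procedural", "causal"]

prompt = build_prompt(CODEBOOK, "Why is the sky blue?")
kappa = cohen_kappa(expert, model)
band = landis_koch(kappa)
```

In practice the string returned by `build_prompt` would be sent to an LLM and its completion parsed back into one of the codebook's codes. That step is omitted here because it depends on the provider and model chosen.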

