CL · Jan 16

Bridging Human Interpretation and Machine Representation: A Landscape of Qualitative Data Analysis in the LLM Era

arXiv:2601.11739v1
h-index: 7
Originality: Synthesis-oriented
AI Analysis

This work addresses inconsistent and limited LLM outputs in qualitative research, a problem relevant to social scientists and AI developers. It is incremental: it builds on existing critiques without introducing new methods.

The paper identifies a gap in LLM-based qualitative data analysis, where current systems focus on low-level meaning and simple representations, lacking reliable interpretive or theoretical inference. It proposes a framework to categorize these systems and outlines an agenda for developing more explicit and governable LLM tools.

LLMs are increasingly used to support qualitative research, yet existing systems produce outputs that vary widely, from trace-faithful summaries to theory-mediated explanations and system models. To make these differences explicit, we introduce a 4$\times$4 landscape crossing four levels of meaning-making (descriptive, categorical, interpretive, theoretical) with four levels of modeling (static structure, stages/timelines, causal pathways, feedback dynamics). Applying the landscape to prior LLM-based automation highlights a strong skew toward low-level meaning and low-commitment representations, with few reliable attempts at interpretive/theoretical inference or dynamical modeling. Based on this gap, we outline an agenda for applying and building LLM systems that make their interpretive and modeling commitments explicit, selectable, and governable.
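
The 4$\times$4 landscape described above can be pictured as a grid with one axis per dimension. The following is a minimal sketch, not the authors' implementation: the enum names, the `tally` helper, and the example placements (`system_a`, `system_b`, `system_c`) are all assumptions made up for illustration, and only the eight level names come from the abstract.

```python
from collections import Counter
from enum import IntEnum
from itertools import product


class Meaning(IntEnum):
    """Levels of meaning-making, ordered from lowest to highest."""
    DESCRIPTIVE = 1
    CATEGORICAL = 2
    INTERPRETIVE = 3
    THEORETICAL = 4


class Modeling(IntEnum):
    """Levels of modeling, ordered from lowest to highest commitment."""
    STATIC_STRUCTURE = 1
    STAGES_TIMELINES = 2
    CAUSAL_PATHWAYS = 3
    FEEDBACK_DYNAMICS = 4


# All 16 cells of the landscape, as (meaning, modeling) pairs.
CELLS = list(product(Meaning, Modeling))


def tally(placements):
    """Count systems per cell; cells with no systems are kept with count 0."""
    counts = Counter(placements.values())
    return {cell: counts.get(cell, 0) for cell in CELLS}


# Hypothetical placements, invented for illustration only.
example = {
    "system_a": (Meaning.DESCRIPTIVE, Modeling.STATIC_STRUCTURE),
    "system_b": (Meaning.CATEGORICAL, Modeling.STATIC_STRUCTURE),
    "system_c": (Meaning.DESCRIPTIVE, Modeling.STAGES_TIMELINES),
}

grid = tally(example)
# Cells that no system occupies, i.e. the kind of gap the abstract points to.
empty_cells = [cell for cell, n in grid.items() if n == 0]
```

Keeping empty cells in the tally, rather than dropping them, is what makes a skew toward the low-level corner visible: the unoccupied interpretive, theoretical, and dynamical cells show up explicitly.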
