CL · Oct 29, 2025

Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research

arXiv:2510.25432v1 · 1 citation · h-index: 1

Originality: Synthesis-oriented

AI Analysis

This work addresses methodological issues for social science researchers using LLMs, but it is incremental: it builds on existing practices and introduces no new technical methods.

The paper tackles challenges such as interpretive bias and low reliability in LLM-assisted qualitative social science research. It introduces a framework built on two dimensions, interpretive depth and autonomy, to classify applications and derive design recommendations, with the aim of improving transparency and reliability.

Large language models (LLMs) are increasingly utilized by researchers across a wide range of domains, and qualitative social science is no exception; however, this adoption faces persistent challenges, including interpretive bias, low reliability, and weak auditability. We introduce a framework that situates LLM usage along two dimensions, interpretive depth and autonomy, thereby offering a straightforward way to classify LLM applications in qualitative research and to derive practical design recommendations. We present the state of the literature with respect to these two dimensions, based on all published social science papers available on Web of Science that use LLMs as a tool and not strictly as the subject of study. Rather than granting models expansive freedom, our approach encourages researchers to decompose tasks into manageable segments, much as they would when delegating work to capable undergraduate research assistants. By maintaining low levels of autonomy and selectively increasing interpretive depth only where warranted and under supervision, one can plausibly reap the benefits of LLMs while preserving transparency and reliability.

