CLCY | Jan 6, 2024

Reflections on Inductive Thematic Saturation as a potential metric for measuring the validity of an inductive Thematic Analysis with LLMs

arXiv:2401.03239v1 | 20 citations | h-index: 6 | Qual Quant
Originality: Synthesis-oriented
AI Analysis

This work addresses how researchers using LLMs can establish transactional validity in qualitative analysis. The contribution is incremental, building on early explorations in this domain.

The paper tackles the challenge of validating inductive Thematic Analysis (TA) conducted with Large Language Models (LLMs) by proposing initial thematic saturation (ITS) as a metric. ITS is computed from the ratio between the slopes of cumulative codes and unique codes, and it measures analytical saturation during initial coding on two datasets.

This paper presents a set of reflections on saturation and the use of Large Language Models (LLMs) for performing Thematic Analysis (TA). The paper suggests that initial thematic saturation (ITS) could be used as a metric to assess part of the transactional validity of TA with LLM, focusing on the initial coding. The paper presents the initial coding of two datasets of different sizes, and it reflects on how the LLM reaches some form of analytical saturation during the coding. The procedure proposed in this work leads to the creation of two codebooks, one comprising the total cumulative initial codes and the other the total unique codes. The paper proposes a metric to synthetically measure ITS using a simple mathematical calculation employing the ratio between slopes of cumulative codes and unique codes. The paper contributes to the initial body of work exploring how to perform qualitative analysis with LLMs.
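The slope-ratio idea in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the summary does not confirm: codes are deduplicated by exact string match (a stand-in for however the paper builds its unique codebook), slopes come from an ordinary least-squares fit against the interview index, and the ratio is taken as unique slope over cumulative slope. The function names `cumulative_counts`, `slope`, and `its_ratio` are hypothetical.

```python
def cumulative_counts(codes_per_interview):
    """Return (cumulative_total, cumulative_unique) after each interview.

    Assumption: two codes are "the same" only if their strings match exactly.
    """
    seen = set()
    total = 0
    cumulative_total, cumulative_unique = [], []
    for codes in codes_per_interview:
        total += len(codes)
        seen.update(codes)
        cumulative_total.append(total)
        cumulative_unique.append(len(seen))
    return cumulative_total, cumulative_unique


def slope(ys):
    """Least-squares slope of ys against interview index 1..n."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def its_ratio(codes_per_interview):
    """Assumed direction: slope of unique codes / slope of cumulative codes."""
    total, unique = cumulative_counts(codes_per_interview)
    total_slope = slope(total)
    if total_slope == 0:
        raise ValueError("no codes were generated; ratio is undefined")
    return slope(unique) / total_slope
```

Under these assumptions, a ratio near 1 would mean almost every new code is unique, while a ratio closer to 0 would mean later interviews mostly repeat existing codes, which is one way to read saturation.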

