CL, Apr 12, 2024

Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian

arXiv:2404.08488v1, 1.91 citations, h-index: 14

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of multilingual thematic analysis for researchers, but its contribution is incremental: it extends existing English-focused methods to another language.

The paper tested whether large language models can perform thematic analysis on Italian text, showing that a pre-trained model produced themes with good resemblance to human-generated ones.

This paper proposes a test of Thematic Analysis (TA) with Large Language Models (LLMs) on data in a language other than English. While there has been promising initial work on using pre-trained LLMs for TA on English data, no tests have examined whether these models can perform the same analysis with good quality in other languages. This paper proposes such a test, using an open access dataset of semi-structured interviews in Italian. The test shows that a pre-trained model can perform TA on the data, including when the prompts are written in Italian. A comparative test shows the model's capacity to produce themes that closely resemble those produced independently by human researchers. The main implication of this study is that pre-trained LLMs may be suitable to support analysis in multilingual situations, so long as the language is supported by the model used.
