Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media
This work addresses a gap in applying LLMs to qualitative analysis in the humanities and social sciences, though its contribution is incremental: it builds on existing methods and applies them to a new dataset.
The study explored using GPT-4 for thematic analysis of a YouTube dataset about Roma migrants in Sweden and found that combining human and AI approaches can enhance the scalability and efficiency of qualitative research in the humanities and social sciences.
In the dynamic field of artificial intelligence (AI), the development and application of Large Language Models (LLMs) for text analysis are of significant academic interest. Despite the promising capabilities of various LLMs in conducting qualitative analysis, their use in the humanities and social sciences has not been thoroughly examined. This article contributes to the emerging literature on LLMs in qualitative analysis by documenting an experimental study involving GPT-4. The study performs thematic analysis (TA) on a YouTube dataset, derived from an EU-funded project, that was previously analyzed by other researchers. The dataset concerns the representation of Roma migrants in Sweden during 2016, a period marked by the aftermath of the 2015 refugee crisis and preceding the Swedish national elections in 2018. Our study seeks to understand the potential of combining human intelligence with AI's scalability and efficiency, examining the advantages and limitations of employing LLMs in qualitative research within the humanities and social sciences. Additionally, we discuss future directions for applying LLMs in these fields.