CL HC May 8, 2024

QuaLLM: An LLM-based Framework to Extract Quantitative Insights from Online Forums

Varun Nagaraj Rao, Eesha Agarwal, Samantha Dalal, Dan Calacci, Andrés Monroy-Hernández

arXiv:2405.05345v211.222 citations h-index: 4 NAACL

Originality: Incremental advance

AI Analysis

This work addresses the need for scalable analysis of online forum data for researchers and regulators, though it is incremental in applying LLMs to a specific domain.

The study introduces QuaLLM, an LLM-based framework for extracting quantitative insights from online forums, and applies it to over one million comments from two of Reddit's rideshare worker communities, uncovering significant worker concerns about AI and algorithmic platform decisions.

Online discussion forums provide crucial data to understand the concerns of a wide range of real-world communities. However, the typical qualitative and quantitative methodologies used to analyze those data, such as thematic analysis and topic modeling, are infeasible to scale or require significant human effort to translate outputs to human-readable forms. This study introduces QuaLLM, a novel LLM-based framework to analyze and extract quantitative insights from text data on online forums. The framework consists of a novel prompting and human evaluation methodology. We applied this framework to analyze over one million comments from two of Reddit's rideshare worker communities, marking the largest study of its type. We uncover significant worker concerns regarding AI and algorithmic platform decisions, responding to regulatory calls about worker insights. In short, our work sets a new precedent for AI-assisted quantitative data analysis to surface concerns from online forums.
