CL CY

Oct 15, 2024

On Classification with Large Language Models in Cultural Analytics

David Bamman, Kent K. Chang, Li Lucy, Naitian Zhou

Berkeley

arXiv:2410.12029v17.221 citations

h-index: 12

CHR

Originality: Synthesis-oriented

AI Analysis

This work addresses classification challenges for researchers in cultural analytics, but it is incremental as it surveys and assesses existing methods without introducing new techniques.

The paper examines the use of large language models for classification in cultural analytics. It finds that prompt-based LLMs are competitive with traditional supervised models on established tasks but perform worse on de novo tasks, and that they can assist sensemaking by serving as an intermediary input to formal theory testing.

In this work, we survey the way in which classification is used as a sensemaking practice in cultural analytics, and assess where large language models can fit into this landscape. We identify ten tasks supported by publicly available datasets on which we empirically assess the performance of LLMs compared to traditional supervised methods, and explore the ways in which LLMs can be employed for sensemaking goals beyond mere accuracy. We find that prompt-based LLMs are competitive with traditional supervised models for established tasks, but perform less well on de novo tasks. In addition, LLMs can assist sensemaking by acting as an intermediary input to formal theory testing.
