CL · Oct 24, 2023

Prevalence and prevention of large language model use in crowd work

arXiv:2310.15683v1 · 5 citations · h-index: 24
Originality: Incremental advance
AI Analysis

This paper addresses a problem facing researchers and practitioners who rely on crowdsourcing: LLM use by crowd workers can compromise research validity. Its contribution is incremental, providing baseline prevalence data and insights into mitigation.

The study found that large language model (LLM) use is prevalent among crowd workers, with an estimated prevalence of around 30% on a text summarization task, and that targeted mitigation strategies, such as asking workers not to use LLMs and disabling copy-pasting, reduced this by about half.

We show that the use of large language models (LLMs) is prevalent among crowd workers, and that targeted mitigation strategies can significantly reduce, but not eliminate, LLM use. On a text summarization task where workers were not directed in any way regarding their LLM use, the estimated prevalence of LLM use was around 30%, but was reduced by about half by asking workers to not use LLMs and by raising the cost of using them, e.g., by disabling copy-pasting. Secondary analyses give further insight into LLM use and its prevention: LLM use yields high-quality but homogeneous responses, which may harm research concerned with human (rather than model) behavior and degrade future models trained with crowdsourced data. At the same time, preventing LLM use may be at odds with obtaining high-quality responses; e.g., when requesting workers not to use LLMs, summaries contained fewer keywords carrying essential information. Our estimates will likely change as LLMs increase in popularity or capabilities, and as norms around their usage change. Yet, understanding the co-evolution of LLM-based tools and users is key to maintaining the validity of research done using crowdsourcing, and we provide a critical baseline before widespread adoption ensues.
