CL CY | Dec 21, 2023

ChatGPT as a commenter to the news: can LLMs generate human-like opinions?

Rayden Tseng, Suzan Verberne, Peter van der Putten

arXiv:2312.13961v10.910 citations | h-index: 17 | Has Code | MISDOOM

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of evaluating LLMs' ability to mimic human opinions for applications in automated content generation, but it is incremental as it confirms limitations in a specific domain.

The study investigated whether GPT-3.5 can generate human-like comments on Dutch news articles. Fine-tuned BERT models easily distinguished human-written from GPT-generated comments, no prompting method performed noticeably better than the others, and human comments showed higher lexical diversity.

ChatGPT, GPT-3.5, and other large language models (LLMs) have drawn significant attention since their release, and the abilities of these models have been investigated for a wide variety of tasks. In this research we investigate to what extent GPT-3.5 can generate human-like comments on Dutch news articles. We define human likeness as 'not distinguishable from human comments', approximated by the difficulty of automatic classification between human and GPT comments. We analyze human likeness across multiple prompting techniques. In particular, we use zero-shot, few-shot and context prompts, for two generated personas. We found that our fine-tuned BERT models can easily distinguish human-written comments from GPT-3.5-generated comments, with none of the prompting methods performing noticeably better than the others. Further analysis showed that human comments consistently had higher lexical diversity than GPT-generated comments. This indicates that although generative LLMs can generate fluent text, their capability to create human-like opinionated comments is still limited.
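
The abstract does not say how lexical diversity was measured. As a rough illustration only, the sketch below uses the type-token ratio (distinct word forms divided by total word tokens), which is one common measure and an assumption here, not necessarily the authors' metric. The function names and the two example comments are hypothetical and are not data from the study.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct word forms divided by total word tokens (0.0 if empty)."""
    # Lowercase and split on word characters; a simplification of real tokenization.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def mean_ttr(comments: list[str]) -> float:
    """Average the per-comment type-token ratio over a list of comments."""
    scores = [type_token_ratio(c) for c in comments]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up illustrative comments (assumption: not taken from the paper's dataset).
varied = ["Wat een onzin, dit kabinet luistert nooit naar gewone mensen."]
repetitive = ["Dit is goed nieuws, dit is echt goed nieuws, dit is goed."]

print(mean_ttr(varied), mean_ttr(repetitive))
```

The type-token ratio falls as texts get longer, so a fair comparison between human and generated comments would need to control for comment length, for example by comparing comments of similar length or by using a length-corrected measure.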
