CL · CY · Mar 31, 2024

Can Language Models Recognize Convincing Arguments?

arXiv:2404.00750v2 · 34 citations · h-index: 12 · EMNLP
Originality: Synthesis-oriented
AI Analysis

This paper addresses concerns about LLMs' persuasive capabilities by providing a benchmark for evaluating how well they recognize convincing arguments. The contribution is incremental, as it builds on existing datasets and tasks.

The study investigates whether large language models (LLMs) can recognize convincing arguments by extending the Durmus and Cardie (2018) dataset with debates, votes, and user traits. It finds that LLMs perform on par with humans on tasks such as distinguishing strong from weak arguments, and that combining predictions from different LLMs surpasses human performance.

The capabilities of large language models (LLMs) have raised concerns about their potential to create and propagate convincing narratives. Here, we study their performance in detecting convincing arguments to gain insights into LLMs' persuasive capabilities without directly engaging in experimentation with humans. We extend a dataset by Durmus and Cardie (2018) with debates, votes, and user traits and propose tasks measuring LLMs' ability to (1) distinguish between strong and weak arguments, (2) predict stances based on beliefs and demographic characteristics, and (3) determine the appeal of an argument to an individual based on their traits. We show that LLMs perform on par with humans in these tasks and that combining predictions from different LLMs yields significant performance gains, surpassing human performance. The data and code released with this paper contribute to the crucial effort of continuously evaluating and monitoring LLMs' capabilities and potential impact. (https://go.epfl.ch/persuasion-llm)
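The abstract reports that combining predictions from different LLMs improves performance, but this summary does not say how the predictions are combined. As a minimal sketch only, assuming simple majority voting over per-example labels, the idea could look like the following. The model names, labels, and toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def ensemble(per_model_predictions):
    """Combine predictions example by example.

    per_model_predictions maps a model name to a list of labels,
    one label per example, with all lists in the same example order.
    """
    columns = zip(*per_model_predictions.values())
    return [majority_vote(list(column)) for column in columns]


def accuracy(predicted, gold):
    """Fraction of examples where the predicted label matches the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy data: which side ("pro" or "con") made the more
# convincing argument in four debates. Not results from the paper.
model_predictions = {
    "model_a": ["pro", "con", "con", "con"],
    "model_b": ["pro", "pro", "pro", "pro"],
    "model_c": ["con", "con", "pro", "con"],
}
gold = ["pro", "con", "pro", "con"]

combined = ensemble(model_predictions)
for name, preds in model_predictions.items():
    print(name, accuracy(preds, gold))
print("ensemble", accuracy(combined, gold))
```

In this toy setup each model errs on a different example, so the vote corrects the individual mistakes. Whether real LLM errors are uncorrelated enough for this to help is an empirical question that the paper's released data and code are meant to address.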

