CL AI | Apr 13

LLMs Struggle with Abstract Meaning Comprehension More Than Expected

arXiv:2604.12018 | 21.4 | h-index: 1

AI Analysis

For NLP researchers, this work highlights a gap in LLMs' abstract reasoning and proposes a modest improvement, though the gains are incremental.

The paper investigates abstract meaning comprehension in LLMs, finding that even GPT-4o struggles under zero-shot, one-shot, and few-shot settings, while fine-tuned BERT and RoBERTa perform better. Adding a bidirectional attention classifier to the fine-tuned models improves accuracy by 4.06% on Task 1 and 3.41% on Task 2.

Understanding abstract meanings is crucial for advanced language comprehension. Despite extensive research, abstract words remain challenging due to their non-concrete, high-level semantics. SemEval-2021 Task 4 (ReCAM) evaluates models' ability to interpret abstract concepts by presenting passages with questions and five abstract options in a cloze-style format. Key findings include: (1) Most large language models (LLMs), including GPT-4o, struggle with abstract meaning comprehension under zero-shot, one-shot, and few-shot settings, while fine-tuned models like BERT and RoBERTa perform better. (2) A proposed bidirectional attention classifier, inspired by human cognitive strategies, enhances fine-tuned models by dynamically attending to passages and options. This approach improves accuracy by 4.06 percent on Task 1 and 3.41 percent on Task 2, demonstrating its potential for abstract meaning comprehension.
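To make the idea of attending to passages and options in both directions more concrete, here is a minimal sketch in plain Python. It is an illustration only, not the paper's architecture: the abstract does not specify the layers, so the function name `score_options`, the dot-product similarity, the way the two attention directions are combined, and the toy 2-dimensional vectors are all assumptions. In practice the vectors would come from a fine-tuned encoder such as BERT or RoBERTa.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def score_options(passage, options):
    """Hypothetical scorer: return one probability per candidate option.

    passage: list of token vectors; options: list of option vectors.
    """
    # Similarity between every passage token and every option.
    sim = [[dot(tok, opt) for opt in options] for tok in passage]

    # Passage -> option: each token distributes attention over the options.
    tok_to_opt = [softmax(row) for row in sim]

    scores = []
    for j, opt in enumerate(options):
        # Option -> passage: the option attends over the passage tokens.
        opt_to_tok = softmax([sim[i][j] for i in range(len(passage))])
        context = [
            sum(w * tok[d] for w, tok in zip(opt_to_tok, passage))
            for d in range(len(opt))
        ]
        # Average attention this option receives from the passage tokens.
        support = sum(row[j] for row in tok_to_opt) / len(passage)
        # Assumed combination: match with attended context plus support.
        scores.append(dot(context, opt) + support)
    return softmax(scores)


# Toy cloze instance: three passage tokens, five candidate options.
passage = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]]
options = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [-1.0, 0.0], [0.0, -1.0]]
probs = score_options(passage, options)
best = max(range(len(options)), key=probs.__getitem__)
print("predicted option index:", best)
```

The sketch mirrors the five-option cloze setup described above: the predicted answer is simply the option with the highest probability, and accuracy would be the fraction of instances where that index matches the gold option.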

