AI, Feb 11, 2025

Understanding LLMs' Fluid Intelligence Deficiency: An Analysis of the ARC Task

arXiv:2502.07190v2, 15 citations, h-index: 69, NAACL
Originality: Synthesis-oriented
AI Analysis

This work identifies critical gaps in LLMs' ability to solve problems without prior knowledge. The contribution is incremental, as it builds on existing research on fluid intelligence assessments.

The paper analyzes LLMs' deficiencies in fluid intelligence by conducting controlled experiments on the ARC task, revealing three major limitations: limited skill composition, unfamiliarity with abstract input formats, and intrinsic left-to-right decoding issues.

While LLMs have exhibited strong performance on various NLP tasks, it is noteworthy that most of these tasks rely on utilizing the vast amount of knowledge encoded in LLMs' parameters, rather than solving new problems without prior knowledge. In cognitive research, the latter ability is referred to as fluid intelligence, which is considered to be critical for assessing human intelligence. Recent research on fluid intelligence assessments has highlighted significant deficiencies in LLMs' abilities. In this paper, we analyze the challenges LLMs face in demonstrating fluid intelligence through controlled experiments, using the most representative ARC task as an example. Our study reveals three major limitations in existing LLMs: limited ability for skill composition, unfamiliarity with abstract input formats, and the intrinsic deficiency of left-to-right decoding. Our data and code can be found at https://wujunjie1998.github.io/araoc-benchmark.github.io/.

Code Implementations: 1 repo
