AI · CL · LG · Feb 17

Recursive Concept Evolution for Compositional Reasoning in Large Language Models

arXiv:2602.15725v1
Originality: Highly original
AI Analysis

The paper addresses a critical bottleneck in AI reasoning on tasks that require abstraction, and it offers a novel method rather than an incremental improvement.

The paper tackles compositional reasoning in large language models, whose accuracy degrades on benchmarks such as ARC-AGI-2 and GPQA. It proposes Recursive Concept Evolution (RCE), which modifies the model's internal representations dynamically during inference. Integrated with Mistral-7B, RCE is reported to gain 12-18 points on ARC-AGI-2 and 8-14 points on GPQA and BBH.

Large language models achieve strong performance on many complex reasoning tasks, yet their accuracy degrades sharply on benchmarks that require compositional reasoning, including ARC-AGI-2, GPQA, MATH, BBH, and HLE. Existing methods improve reasoning by expanding token-level search through chain-of-thought prompting, self-consistency, or reinforcement learning, but they leave the model's latent representation space fixed. When the required abstraction is not already encoded in this space, performance collapses. We propose Recursive Concept Evolution (RCE), a framework that enables pretrained language models to modify their internal representation geometry during inference. RCE introduces dynamically generated low-rank concept subspaces that are spawned when representational inadequacy is detected, selected through a minimum description length criterion, merged when synergistic, and consolidated via constrained optimization to preserve stability. This process allows the model to construct new abstractions rather than recombining existing ones. We integrate RCE with Mistral-7B and evaluate it across compositional reasoning benchmarks. RCE yields 12-18 point gains on ARC-AGI-2, 8-14 point improvements on GPQA and BBH, and consistent reductions in depth-induced error on MATH and HLE.
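The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of spawning a low-rank subspace and keeping it only if a minimum description length (MDL) test approves. It is not the authors' algorithm. Every name here (`maybe_spawn`, `description_length`, `bits_per_param`) is hypothetical. The rank-1 candidate from power iteration and the crude two-part Gaussian code are assumptions chosen for simplicity.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual(h, basis):
    """Subtract from h its component along each basis vector in turn."""
    r = list(h)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r

def top_direction(vectors, iters=50):
    """Power iteration: dominant direction of the given vectors (rank-1 candidate)."""
    d = len(vectors[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [0.0] * d
        for x in vectors:  # accumulates w = (X^T X) v
            c = dot(x, v)
            w = [wi + c * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [wi / norm for wi in w]
    return v

def description_length(hiddens, basis, bits_per_param=32.0, eps=1e-6):
    """Assumed two-part code: parameter bits plus a Gaussian code for residuals.
    Additive constants are dropped, so only differences are meaningful."""
    n, d = len(hiddens), len(hiddens[0])
    sq = sum(dot(r, r) for r in (residual(h, basis) for h in hiddens))
    var = sq / (n * d) + eps
    data_bits = 0.5 * n * d * math.log2(var)
    model_bits = bits_per_param * d * len(basis)
    return model_bits + data_bits

def maybe_spawn(hiddens, basis):
    """Propose one new direction; keep it only if total description length drops."""
    residuals = [residual(h, basis) for h in hiddens]
    candidate = top_direction(residuals)
    new_basis = basis + [candidate]
    if description_length(hiddens, new_basis) < description_length(hiddens, basis):
        return new_basis, True
    return basis, False

# Toy hidden states that all lie along a single direction.
hiddens = [[float(i + 1), float(i + 1), 0.0, 0.0] for i in range(40)]
basis, accepted = maybe_spawn(hiddens, [])
```

In this toy setup the first candidate direction is accepted because it explains nearly all of the residual energy. A second call with the returned basis is rejected, since the extra parameters cost more bits than they save. Merging and constrained consolidation, which the abstract also mentions, are not sketched here.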

